Summary. We study a dynamical random network model in which at every construction
step a new vertex is introduced and attached to every existing vertex independently with a
probability proportional to a concave function f of its current degree. We give a criterion
for the existence of a giant component, which is both necessary and sufficient, and which
becomes explicit when f is linear. Otherwise it allows the derivation of explicit necessary
and sufficient conditions, which are often fairly close. We give an explicit criterion to decide
when there is a giant component that is robust under random removal of edges. We also
determine asymptotically the size of the giant component and the empirical distribution of
component sizes in terms of the survival probability and size distribution of a multitype
branching random walk associated with f.
1 Introduction

1.1 Motivation and background 

Since the publication of the highly influential paper of Barabási and Albert [BA99] the
preferential attachment paradigm has captured the imagination of scientists across the disciplines
and has led to a large body of research, most of it nonrigorous from a mathematical point of view. The
underlying idea is that the topological structure of large networks, such as the World-Wide-Web,
social interaction or citation networks, can be explained by the principle that these networks
are built dynamically, and new vertices prefer to be attached to existing vertices which already
have a high degree in the existing network.

Barabási and Albert [BA99] and their followers argue that, by building a network in which every
new vertex is attached to a number of old vertices with a probability proportional to a linear
function of the current degree, we obtain networks whose degree distribution follows a power
law. This degree distribution is consistent with that observed in large real networks, but quite
different from the one encountered in the Erdős-Rényi model, on which most of the mathematical
literature had focused up to that date. Soon after that, Krapivsky and Redner [KR01] suggested
to look at more general models, in which the probability of attaching a new vertex to a current
one could be an arbitrary function f of its degree, called the attachment rule.

In this paper we investigate the properties of preferential attachment networks with general
concave attachment rules. There are at least two good reasons to do this: On the one hand
it turns out that global features of the network can depend in a very subtle fashion on the
function f and only the possibility to vary this parameter gives sufficient leeway for statistical
modelling and allows a critical analysis of the robustness of the results. On the other hand we
are interested in the transitions between qualitatively different behaviour as we pass from absence
of preferential attachment, the case of constant attachment rules f, effectively corresponding
to a variant of the Erdős-Rényi model, to strong forms of preferential attachment as given by
linear attachment rules f.

In a previous paper [DM09] we have studied degree distributions for such a model. We found
the exact asymptotic degree distributions, which constitute the crucial tool for comparison with
other models. The main result of [DM09] showed the emergence of a perpetual hub, a vertex
which from some time on remains the vertex of maximal degree, when the tail of f is sufficiently
heavy to ensure convergence of the series $\sum_n 1/f(n)^2$. In the present paper, which is independent
of [DM09], we look at the global connectivity features of the network and ask for the emergence
of a giant component, i.e. a connected component comprising a positive fraction of all vertices
present.

Our first main result gives a necessary and sufficient criterion for the existence of a giant
component in terms of the spectral radii of a family of compact linear operators associated with f, see
Theorem 1.1. An analysis of this result shows that a giant component can exist for two separate
reasons: either the tail of f at infinity is sufficiently heavy so that due to the strength of the
preferential attachment mechanism the topology of the network enforces existence of a giant
component or the bulk of f is sufficiently large to ensure that the edge density of the network
is high enough to connect a positive proportion of vertices. We show that in the former case
the giant component is robust under random deletion of edges, whereas it is not in the latter
case. In Theorem 1.5 we characterise the robust networks by a completely explicit criterion.

Further results show that the asymptotic size of the giant component is determined by the
survival probability of a random tree associated with f, see Theorem 1.7, and the proportion of
components with a given size is given by the distribution of the total number of vertices in this
tree, see Theorem 1.8.



The general approach to studying the connectivity structure in our model is to analyse a pro- 
cess that systematically explores the neighbourhood of a vertex in the network. Locally this 
neighbourhood looks approximately like a tree, which is constructed using a spatial branching 
process. The properties of this random tree determine the connectivity structure. It should be 
mentioned that although the tree approximation holds only locally it is sufficiently powerful to 
give global results through a technique called sprinkling. 

This approach as such is not new, for example it has been carried out for the class of
inhomogeneous random graphs by Bollobás, Janson and Riordan in the seminal paper [BJR07]. What
is new here is that the approach is carried forward very substantially to treat the much more 
complex situation of a preferential attachment model with a wide range of attachment functions 
including nonlinear ones. The increased complexity originates in the first instance from the fact 
that the presence of two potential edges in our model is not independent if these have the same 
left end vertex. This is reflected in the fact that in the spatial branching process underlying 
the construction the offspring distributions are not given by a Poisson process. Additionally, 
due to the nonlinearity of the attachment function, information about parent vertices has to 
be retained in the form of a type chosen from an infinite type space. Hence, rather than being 
a relatively simple Galton-Watson tree, the analysis of our neighbourhoods has to be built on
an approximation by a multitype branching random walk, which involves an infinite number of 
offspring and an uncountable type space. In the light of this it is rather surprising that we are 
able to get results that are in several aspects finer than those obtained for linear preferential 
attachment models, see for example the recent paper of Dommers et al. [DHH10]. 

While the criterion for existence of a giant component is relatively abstract for a general
attachment function, we show that it becomes completely explicit if this function is linear, see
Proposition 1.2. Moreover, in the general case the criterion can be approximated and then allows
explicit necessary or sufficient estimates, which are typically rather close, see Proposition 1.9. It
is worth noting that, although our results focus on the much harder case of nonlinear attachment
rules, they are also new in the case of linear attachment rules f and therefore represent very
significant progress on several fronts of research.



1.2 The model 

For any $0 \le \gamma < 1$ we call a concave function $f\colon \{0, 1, 2, \dots\} \to (0, \infty)$ with $f(0) \le 1$ and

$$\Delta f(k) := f(k+1) - f(k) \le \gamma \quad \text{for all } k \ge 0,$$

a $\gamma$-attachment rule, or simply attachment rule. Observe that any f satisfying these conditions
is increasing with $f(k) \le k + 1$ for all $k \ge 0$.



Given an attachment rule f, we define a growing sequence $(\mathcal{G}_N)_{N \in \mathbb{N}}$ of random networks by the
following iterative scheme:

• The network $\mathcal{G}_1$ consists of a single vertex (labeled 1) without edges,

• at each time $N \ge 1$, given the network $\mathcal{G}_N$, we add a new vertex (labeled N + 1) and

• insert for each old vertex M a directed edge $N + 1 \to M$ with probability

$$\frac{f(\text{indegree of } M \text{ at time } N)}{N}$$

to obtain the network $\mathcal{G}_{N+1}$.

The new edges are inserted independently for each old vertex. Note that our conditions on f
guarantee that in each evolution step the probability for adding an edge is at most 1.
Edges in the random network $\mathcal{G}_N$ are dependent if they point towards the same vertex
and independent otherwise. Formally we are dealing with directed networks, but, by
construction, all edges point from the younger to the older vertex, so that the directions
can trivially be recreated from the undirected (labeled) graph. All the notions of connectedness
which we discuss in this paper are based on the undirected networks.
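
The iterative scheme can be sketched in a few lines of Python. This is an illustrative sketch only and not code from the paper: the function names `simulate_network` and `largest_component_fraction`, the seed handling and the union-find bookkeeping are assumptions made for the example, and the quadratic running time restricts it to moderate network sizes.

```python
import random


def simulate_network(f, n_vertices, seed=0):
    """Grow the network up to n_vertices vertices for an attachment rule f.

    Returns the list of edges as pairs (younger, older) with 1-based labels.
    """
    rng = random.Random(seed)
    indegree = [0] * (n_vertices + 1)  # index 0 is unused
    edges = []
    for n in range(1, n_vertices):  # time N = n, the new vertex is n + 1
        new = n + 1
        for m in range(1, n + 1):
            # edge new -> m with probability f(indegree of m at time n) / n
            if rng.random() < f(indegree[m]) / n:
                edges.append((new, m))
                # safe to update now: m is inspected only once per time step
                indegree[m] += 1
    return edges


def largest_component_fraction(n_vertices, edges):
    """Proportion of vertices in the largest component of the undirected graph."""
    parent = list(range(n_vertices + 1))

    def find(x):
        while parent[x] != x:
            parent[x] = parent[parent[x]]  # path halving
            x = parent[x]
        return x

    for u, v in edges:
        ru, rv = find(u), find(v)
        if ru != rv:
            parent[ru] = rv
    sizes = {}
    for v in range(1, n_vertices + 1):
        r = find(v)
        sizes[r] = sizes.get(r, 0) + 1
    return max(sizes.values()) / n_vertices
```

For instance, calling `simulate_network(lambda k: 0.5 * k + 0.5, 2000)` and passing the result to `largest_component_fraction(2000, edges)` gives a rough finite-size impression of the proportion of vertices in the largest component.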

Our model differs from that studied in the majority of publications in one respect: We do not
add a fixed number of edges in every step but a random number, corresponding formally to the
outdegree of vertices in the directed network. It turns out, see Theorem 1.1 (b) in [DM09], that
this random number is asymptotically Poisson distributed and therefore has very light tails.
The formal universality class of our model is therefore determined by its asymptotic indegree
distribution which, by Theorem 1.1 (a) in [DM09], is given by the probability weights

$$\mu_k = \frac{1}{1 + f(k)} \prod_{l=0}^{k-1} \frac{f(l)}{1 + f(l)} \qquad \text{for } k \in \mathbb{N} \cup \{0\}.$$

Note that these are power laws when f(k) is of order k (but f need not be linear). More precisely,
as $k \uparrow \infty$,

$$\frac{f(k)}{k} \to \gamma \in (0,1) \quad \Longrightarrow \quad \frac{-\log \mu_k}{\log k} \to 1 + \frac{1}{\gamma},$$

so that the LCD-model of Bollobás and Riordan [BR03] compares to the case $\gamma = \frac12$.
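
As a small numerical illustration, the weights can be evaluated directly. The sketch below assumes that the weights have the product form $\mu_k = \frac{1}{1+f(k)} \prod_{l<k} \frac{f(l)}{1+f(l)}$; the function name `indegree_weights` is hypothetical and chosen for this example only.

```python
def indegree_weights(f, k_max):
    """Return [mu_0, ..., mu_{k_max}] for an attachment rule f.

    Assumes mu_k = 1 / (1 + f(k)) * prod_{l < k} f(l) / (1 + f(l)).
    """
    weights = []
    prod = 1.0  # running product over l = 0, ..., k - 1
    for k in range(k_max + 1):
        weights.append(prod / (1.0 + f(k)))
        prod *= f(k) / (1.0 + f(k))
    return weights
```

For a constant rule the weights are geometric, whereas for a linear rule such as `lambda k: 0.5 * k + 0.5` the ratio `-log(mu_k) / log(k)` slowly approaches the predicted exponent for large k.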

1.3 Statement of the main results 

We fix $0 \le \gamma < 1$ and a $\gamma$-attachment rule f and define a pure birth Markov process $(Z_t : t \ge 0)$
started in zero with generator

$$L g(k) = f(k)\, \Delta g(k),$$

which means that the process leaves state k with rate f(k). Given a suitable $0 < \alpha < 1$ we
define a linear operator $A_\alpha$ on the Banach space $C(S)$ of continuous, bounded functions on
$S := \{\ell\} \cup [0, \infty]$, by

$$A_\alpha g(\tau) := \int_0^\infty g(t)\, e^{\alpha t}\, dM(t) + \int_0^\infty g(\ell)\, e^{-\alpha t}\, dM^\tau(t),$$

where increasing functions $M$, resp. $M^\tau$, are given by

$$M(t) = \int_0^t e^{-s}\, E[f(Z_s)]\, ds, \qquad M^\ell(t) = E[Z_t],$$

$$M^\tau(t) = E[Z_t \mid \Delta Z_\tau = 1] - 1_{[\tau, \infty)}(t) \quad \text{for } \tau \in [0, \infty).$$

We shall see in Remark 2.6 that $M^\tau \le M^{\tau'}$ for all $\tau \ge \tau' \ge 0$ and therefore $M^\infty = \lim_{\tau \to \infty} M^\tau$
is well-defined. We shall see in Lemma 3.1 that

$$A_\alpha 1(0) < \infty \iff A_\alpha \text{ is a well-defined compact operator.}$$

In particular, the set $\mathcal{I}$ of parameters where $A_\alpha$ is a well-defined (and therefore also compact)
linear operator is an open (but possibly empty) subinterval of (0, 1).

Recall that we say that a giant component exists in the sequence of networks $(\mathcal{G}_N)_{N \in \mathbb{N}}$ if the
proportion of vertices in the largest connected component $\mathcal{C}_N \subset \mathcal{G}_N$ converges, for $N \uparrow \infty$, in
probability to a positive number.

Theorem 1.1 (Existence of a giant component). No giant component exists if and only if there
exists $0 < \alpha < 1$ such that $A_\alpha$ is a compact operator with spectral radius $\rho(A_\alpha) \le 1$.

The most important example is the linear case $f(k) = \gamma k + \beta$. In this case the family of
operators $A_\alpha$ can be analysed explicitly, see Section 1.4.2. We obtain the following result.

Proposition 1.2 (Existence of a giant component: linear case). If $f(k) = \gamma k + \beta$ for some
$0 \le \gamma < 1$ and $0 < \beta \le 1$, then there exists a giant component if and only if

$$\gamma \ge \frac12 \quad \text{or} \quad \beta > \frac{(\frac12 - \gamma)^2}{1 - \gamma}.$$

This result corresponds to the following intuition: If the preferential attachment is sufficiently
strong (i.e. $\gamma \ge \frac12$), then there exists a giant component in the network for purely topological
reasons and regardless of the edge density. However if the preferential attachment is weak (i.e.
$\gamma < \frac12$) then a giant component exists only if the edge density is sufficiently large.
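
The criterion of Proposition 1.2 is easily turned into a predicate. The following sketch assumes the criterion in the form "$\gamma \ge \frac12$ or $\beta > (\frac12 - \gamma)^2 / (1 - \gamma)$"; the function name `has_giant_component_linear` is hypothetical.

```python
def has_giant_component_linear(gamma, beta):
    """Criterion of Proposition 1.2 for the linear rule f(k) = gamma * k + beta."""
    if not (0 <= gamma < 1 and 0 < beta <= 1):
        raise ValueError("need 0 <= gamma < 1 and 0 < beta <= 1")
    if gamma >= 0.5:
        # strong preferential attachment: giant component for every beta
        return True
    # weak preferential attachment: the edge density must be large enough
    return beta > (0.5 - gamma) ** 2 / (1 - gamma)
```

For $\gamma = 0$ the predicate reduces to $\beta > \frac14$, in line with Example 1.3.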

Example 1.3. If $\gamma = 0$ the model is a dynamical version of the Erdős-Rényi model sometimes
called Dubins' model. Observe that in this case there is no preferential attachment. The criterion
for existence of a giant component is $\beta > \frac14$, a fact which is essentially known from work of
Shepp [She89], see Bollobás, Janson and Riordan [BJR05, BJR07] for more details.

Example 1.4. If $\gamma = \frac12$ the model is conjectured to be in the same universality class as the
LCD-model of Bollobás and Riordan [BR03]. In this case we obtain that a giant component
exists regardless of the value of $\beta$, i.e. of the overall edge density. This is closely related to the
robustness of the giant component under random removal of edges, obtained in [BR03].

As the last example indicates, in some situations the giant component is robust and survives
a reduction in the edge density. To make this precise in a general setup, we fix a parameter
$0 < p < 1$, remove every edge in the network independently with probability $1 - p$ and call the
resulting network the percolated network. We say the giant component in a network is robust,
if, for every $0 < p < 1$, the percolated network has a giant component.



Theorem 1.5 (Percolation). The giant component in the network is robust if and only if

$$\sum_{k=0}^\infty \prod_{j=0}^{k} \frac{f(j)}{\frac12 + f(j)} = \infty.$$



Remark 1.6. Precise criteria for the existence of a giant component in the percolated network
can be given in terms of the operators $(A_\alpha : \alpha \in \mathcal{I})$ as follows:

(i) The giant component in the network is robust if and only if $\mathcal{I} = \emptyset$.

(ii) If $\mathcal{I} \neq \emptyset$ then the percolated network has a giant component if and only if

$$p > \frac{1}{\min_{\alpha \in \mathcal{I}} \rho(A_\alpha)}.$$

(iii) In the linear case $f(k) = \gamma k + \beta$, for $\gamma > 0$, the network is robust if and only if $\gamma \ge \frac12$.
Otherwise, the percolated network has a giant component if and only if

$$p > \Big( \frac{1}{2\gamma} - 1 \Big) \Big( \sqrt{1 + \frac{\gamma}{\beta}} - 1 \Big). \qquad (1)$$

Observe that running percolation with retention parameter p on the network $\mathcal{G}_N$ with attachment
rule f leads to a network which stochastically dominates the network with attachment rule pf.
Only if f is constant, say $f(k) = \beta$, these random networks coincide and the obvious criterion
for existence of a giant component in this case is $p > \frac{1}{4\beta}$. This is in line with the formal criterion
obtained by letting $\gamma \downarrow 0$ in (1).
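
The percolation threshold in the linear case can be evaluated numerically, which also makes the limit $\gamma \downarrow 0$ visible. The sketch below assumes the threshold in the form $(\frac{1}{2\gamma} - 1)(\sqrt{1 + \gamma/\beta} - 1)$ for $0 < \gamma < \frac12$ and $\frac{1}{4\beta}$ for $\gamma = 0$; the function name `percolation_threshold_linear` is hypothetical.

```python
import math


def percolation_threshold_linear(gamma, beta):
    """Critical retention probability for f(k) = gamma * k + beta.

    A returned value of 1 or more means that the unpercolated network
    itself has no giant component.
    """
    if gamma >= 0.5:
        return 0.0  # robust case: a giant component survives for every p > 0
    if gamma == 0:
        return 1.0 / (4.0 * beta)  # constant attachment rule
    return (1.0 / (2.0 * gamma) - 1.0) * (math.sqrt(1.0 + gamma / beta) - 1.0)
```

At the boundary of Proposition 1.2, for example $\gamma = \frac14$ and $\beta = \frac{1}{12}$, the threshold equals one, so the two criteria match up.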

We now define a multitype branching random walk, which represents an idealization of the
exploration of the neighbourhood of a vertex in the infinite network $\mathcal{G}_\infty$ and which is at the
heart of our results on the sizes of connected components in the network. Particle positions are
on the real line and types are in the space S. The initial particle is of type $\ell$ with arbitrary
starting position. Recall the definition of the pure birth Markov process $(Z_t : t \ge 0)$ and denote
the associated semigroup by $(P_t : t \ge 0)$. For $\tau \ge 0$, let $(Z^{(\tau)}_t : t \ge 0)$ be the same process
conditioned to have a birth at time $\tau$. This process can be formally defined via its compensator

$$\Big( \int_0^{t \wedge \tau} f(Z^{(\tau)}_u)\, \frac{P_{\tau-u} f(Z^{(\tau)}_u + 1)}{P_{\tau-u} f(Z^{(\tau)}_u)}\, du + 1_{[\tau, \infty)}(t) + \int_{t \wedge \tau}^{t} f(Z^{(\tau)}_u)\, du : t \ge 0 \Big). \qquad (2)$$

Each particle of type $\ell$ in position x generates offspring

• to its right of type $\ell$ with relative positions at the jumps of the process $(Z_t : t \ge 0)$;

• to its left with relative positions distributed according to the Poisson point process $\Pi$ on
$(-\infty, 0]$ with intensity measure

$$e^{t}\, E[f(Z_{-t})]\, dt,$$

and type being the distance to the parent particle.

Each particle of type $\tau \ge 0$ in position x generates offspring

• to its left in the same manner as with a parent of type $\ell$;

• to its right of type $\ell$ with relative positions at the jumps of $(Z^{(\tau)}_t - 1_{[\tau, \infty)}(t) : t \ge 0)$.

Figure 1: Offspring of an $\ell$-type particle in the branching random walk. A particle generates
finitely many offspring to its left, but infinitely many offspring to its right.

This branching random walk with infinitely many particles is called the idealized branching
random walk (IBRW). Note that the functions $M$ and $M^\tau$ featuring in the definition of our operators $A_\alpha$
are derived from the IBRW: $M(t)$ is the expected number of particles within distance t to the
left of any given particle, and $M^\tau(t)$ is the expected number of particles within distance t to the
right of a given particle of type $\tau$.

Equally important to us is the process representing an idealization of the exploration of the
neighbourhood of a typical vertex in a large but finite network. This is the killed branching
random walk obtained from the IBRW by removing all particles which have a position $x > 0$
together with their entire descendancy tree.

Starting this process with one particle in position $x_0 < 0$ (the root), where $-x_0$ is standard
exponentially distributed, we obtain a random rooted tree called the idealized neighbourhood
tree (INT) and denoted by $\mathfrak{T}$. The genealogical structure of the tree approximates the relative
neighbourhood of a typical vertex in a large but finite network. We denote by $\#\mathfrak{T}$ the total
number of vertices in the INT and say that the INT survives if this number is infinite.




types are distances to x 



(Z[ T ' - l{,.>r}) 

^-type particles 



Figure 2: Offspring of a particle of type r £ [0, oo) in the branching random walk. Offspring to 
the right have type £, offspring to the left have type given by the distance to the parent. 



The rooted tree $\mathfrak{T}$ is the weak local limit in the sense of Benjamini and Schramm [BS01] of the
sequence of graphs in our preferential attachment model. An interesting result about weak local
limits for a different variant of the preferential attachment network with a linear attachment
function, including the LCD-model, was recently obtained by Berger et al. [BBCS09]. In the
present paper we shall not make the abstract notion of weak local limit explicit in our context.
Instead, we go much further and give some fine results based on our neighbourhood
approximation, which cannot be obtained from weak limit theorems alone. The following two theorems
show that the INT determines the clustering structure of the networks in a strong sense.



Theorem 1.7 (Size of the giant component). Let f be an attachment rule and denote by p(f)
the survival probability of the INT. We denote by $\mathcal{C}^{(1)}_N$ and $\mathcal{C}^{(2)}_N$ the largest and second largest
connected component of $\mathcal{G}_N$. Then

$$\frac{\#\mathcal{C}^{(1)}_N}{N} \to p(f) \quad \text{and} \quad \frac{\#\mathcal{C}^{(2)}_N}{N} \to 0, \quad \text{in probability.}$$

In particular, there exists a giant component if and only if $p(f) > 0$.



Figure 3: Simulation of the proportion of vertices in the giant component (relative size of the
giant component) in the linear case. The curve forming the lower envelope is determined explicitly
in Proposition 1.2. The plot is based on 15,000 Monte Carlo simulations of the branching process
for 80 times 80 gridpoints in the $(\beta, \gamma)$-plane.



The final theorem shows the cluster size distribution in the case that no giant component exists. 
In this case typical connected components, or clusters, are of finite size. 

Theorem 1.8 (Empirical distribution of component sizes). Let f be an attachment rule and
denote by $\mathcal{C}_N(v)$ the connected component containing the vertex $v \in \mathcal{G}_N$. Then, for every $k \in \mathbb{N}$,

$$\frac{1}{N} \sum_{v=1}^{N} 1\{\#\mathcal{C}_N(v) = k\} \to P\{\#\mathfrak{T} = k\} \quad \text{in probability.}$$



1.4 Examples 

1.4.1 Explicit criteria for general attachment rules 

The necessary and sufficient criterion for the existence of a giant component given in terms of 
the spectral radius of a compact operator on an infinite dimensional space appears unwieldy. 
However a small modification gives upper and lower bounds, which allow very explicit necessary 
or sufficient criteria that are close in many cases, see Figure 4. 

Proposition 1.9. Suppose f is an arbitrary attachment rule and let

$$a[f] := \sum_{k=0}^\infty \prod_{j=0}^{k} \frac{f(j)}{\frac12 + f(j)} \quad \text{and} \quad c[f] := \sum_{k=0}^\infty \prod_{j=0}^{k} \frac{f(j+1)}{\frac12 + f(j+1)}.$$

(i) If $a[f] > \frac12$ then there exists a giant component.

(ii) If $\frac12 \big( a[f] + \sqrt{a[f]\, c[f]} \big) \le \frac12$ then there exists no giant component.
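
The two series can be approximated by partial sums. The sketch below assumes the series in the form $a[f] = \sum_k \prod_{j \le k} f(j)/(\frac12 + f(j))$ and $c[f] = \sum_k \prod_{j \le k} f(j+1)/(\frac12 + f(j+1))$; the names `series_a_c` and `classify` and the truncation level `k_max` are hypothetical. Since all terms are positive, a partial sum exceeding $\frac12$ verifies (i) rigorously, whereas the check of (ii) is only heuristic because the tails of the series are ignored.

```python
import math


def series_a_c(f, k_max=100000):
    """Partial sums of the series a[f] and c[f] up to index k_max."""
    a = c = 0.0
    prod_a = prod_c = 1.0
    for k in range(k_max + 1):
        prod_a *= f(k) / (0.5 + f(k))
        prod_c *= f(k + 1) / (0.5 + f(k + 1))
        a += prod_a
        c += prod_c
    return a, c


def classify(f, k_max=100000):
    """Apply the two explicit criteria to the truncated series."""
    a, c = series_a_c(f, k_max)
    if a > 0.5:
        return "giant component"
    if 0.5 * (a + math.sqrt(a * c)) <= 0.5:
        return "no giant component (up to truncation error)"
    return "undecided"
```

For a constant rule $f \equiv \beta$ both series are geometric with value $2\beta$, so that the two criteria together recover the threshold $\beta = \frac14$ of Example 1.3.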



Figure 4: For the attachment function $f(k) = \gamma k + \beta$ the figure shows the curves $a[f] = \frac12$
and $a[f] + \sqrt{a[f]\, c[f]} = 1$, which form lower and upper bound for the boundary between the two
phases, nonexistence and existence of the giant component, in the $(\beta, \gamma)$-plane. We observe a
remarkable closeness of our bounds.



Remark 1.10.

• The term $\frac12 \big( a[f] + \sqrt{a[f]\, c[f]} \big)$ differs from $a[f]$ by no more than a factor of
$\frac12 \Big( 1 + \sqrt{1 + \frac{1}{2 f(0)}} \Big)$.



• A giant component exists if $\liminf_{n \to \infty} f(n)/n \ge \frac12$, as this implies divergence of the series $a[f]$.

• If the series $a[f]$ converges, for example because $\limsup_{n \to \infty} f(n)/n < \frac12$, then there exists $\varepsilon > 0$
depending on f(1), f(2), . . . such that no giant component exists if $f(0) < \varepsilon$.



Proof of Proposition 1.9. (i) For a lower bound on the spectral radius we recall that
$M^\tau \ge M^\ell$ and therefore we may replace $M^\tau$ in the definition of $A_\alpha$ by $M^\ell$. Then $A_\alpha g(\tau)$ no
longer depends on the value of $\tau \in [0, \infty]$ but only on whether $\tau = \ell$ or otherwise.
Hence the operator collapses to become a 2 x 2 matrix of the form

$$A = \begin{pmatrix} a(\alpha) & a(\alpha) \\ b(\alpha) & b(\alpha) \end{pmatrix}$$

with

$$a(\alpha) = \int_0^\infty e^{-\alpha t}\, E f(Z_t)\, dt, \qquad b(\alpha) = \int_0^\infty e^{(\alpha - 1) t}\, E f(Z_t)\, dt.$$

Recalling that $(Z_t : t \ge 0)$ is a pure birth process with jump rate in state k given by f(k), we
can simplify this expression, using $T_k$ as the entry time into state k, as follows

$$\int_0^\infty e^{-\alpha t}\, E f(Z_t)\, dt = E \sum_{k=0}^\infty f(k) \int_{T_k}^{T_{k+1}} e^{-\alpha t}\, dt = \sum_{k=0}^\infty f(k)\, \frac{1}{\alpha} \big[ E e^{-\alpha T_k} - E e^{-\alpha T_{k+1}} \big].$$

Recalling that $T_k$ is the sum of independent exponential random variables with parameter f(j),
j = 0, . . . , k − 1, we obtain

$$E e^{-\alpha T_k} = \prod_{j=0}^{k-1} \frac{f(j)}{f(j) + \alpha},$$

and hence

$$a(\alpha) = \sum_{k=0}^\infty \prod_{j=0}^{k} \frac{f(j)}{f(j) + \alpha},$$

and similarly, for $0 < \alpha < 1$,

$$b(\alpha) = \sum_{k=0}^\infty \prod_{j=0}^{k} \frac{f(j)}{f(j) + 1 - \alpha}.$$

Now note that $\rho(A) = a(\alpha) + b(\alpha)$ and this is minimal for $\alpha = \frac12$, whence $a(\alpha) = b(\alpha) = a[f]$.
This shows that the given criterion is sufficient for the existence of a giant component.

(ii) For an upper bound on the spectral radius we use Lemma 2.5 to see that $M^\tau \le M^0$ and
therefore we may replace $M^\tau$ in the definition of $A_\alpha$ by $M^0$, again reducing the operator $A_\alpha$ to
a 2 x 2 matrix which now has the form

$$A = \begin{pmatrix} a(\alpha) & a(\alpha) \\ c(\alpha) & a(\alpha) \end{pmatrix}$$

with $a(\alpha)$ as before and

$$c(\alpha) = \int_0^\infty e^{-\alpha t}\, E^1[f(Z_t)]\, dt,$$

where $E^1$ is the expectation with respect to the Markov process $(Z_t : t \ge 0)$ started with $Z_0 = 1$.
As before we obtain

$$c(\alpha) = E^1 \Big[ \sum_{k=1}^\infty f(k) \int_{T_k}^{T_{k+1}} e^{-\alpha t}\, dt \Big] = \sum_{k=1}^\infty f(k)\, \frac{1}{\alpha} \big[ E^1 e^{-\alpha T_k} - E^1 e^{-\alpha T_{k+1}} \big] = \sum_{k=0}^\infty \prod_{j=0}^{k} \frac{f(j+1)}{f(j+1) + \alpha}.$$

Note that $\rho(A) = a(\alpha) + \sqrt{a(\alpha)\, c(\alpha)}$, so that choosing $\alpha = \frac12$, which implies $a(\alpha) = a[f]$ and
$c(\alpha) = c[f]$, gives the result. □

1.4.2 The case of linear attachment rules 

We show how in the linear case $f(k) = \gamma k + \beta$ the operators $(A_\alpha : \alpha \in \mathcal{I})$ can be analysed
explicitly and allow us to infer Proposition 1.2 from Theorem 1.1. We write $P^k$ and $E^k$ for probability
and expectation with respect to the Markov process $(Z_t : t \ge 0)$ started with $Z_0 = k$.

Lemma 1.11. For $f(k) = \gamma k + \beta$ we have, for all $k \ge 0$,

$$E^k[f(Z_t)] = f(k)\, e^{\gamma t}, \qquad E^k[f(Z_t)^2] = \big( f(k)^2 + f(k)\gamma \big) e^{2\gamma t} - f(k)\gamma\, e^{\gamma t}$$

and therefore

$$dM(t) = \beta e^{(\gamma - 1) t}\, dt, \qquad dM^\ell(t) = \beta e^{\gamma t}\, dt, \qquad dM^\tau(t) = (\beta + \gamma) e^{\gamma t}\, dt \quad \text{for } \tau \in [0, \infty].$$
Proof. Recall the definition of the generator L of $(Z_t : t \ge 0)$. The process $(X_t : t \ge 0)$ given by

$$X_t = f(Z_t) - \int_0^t L f(Z_s)\, ds = f(Z_t) - \gamma \int_0^t f(Z_s)\, ds$$

is a local martingale. Let $(\tau_n)_{n \in \mathbb{N}}$ be a localising sequence of stopping times and note that

$$E^k[f(Z_t)] = \lim_{n \to \infty} E^k f(Z_{t \wedge \tau_n}) = f(k) + \gamma \lim_{n \to \infty} E^k \int_0^{t \wedge \tau_n} f(Z_s)\, ds = f(k) + \gamma \int_0^t E^k[f(Z_s)]\, ds.$$

We obtain the unique solution $E^k[f(Z_t)] = f(k)\, e^{\gamma t}$. The analogous approach with f replaced
by $f^2$ gives

$$E^k[f^2(Z_t)] = \gamma^2 \int_0^t E^k f(Z_s)\, ds + 2\gamma \int_0^t E^k[f^2(Z_s)]\, ds + f(k)^2$$

$$= f(k)\gamma\, (e^{\gamma t} - 1) + 2\gamma \int_0^t E^k[f^2(Z_s)]\, ds + f(k)^2,$$

and we obtain the unique solution

$$E^k[f^2(Z_t)] = \big( f(k)^2 + f(k)\gamma \big) e^{2\gamma t} - f(k)\gamma\, e^{\gamma t}.$$

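The results for $M$ and $M^\ell$ follow directly from these formulas. To characterize $M^\tau$ for $\tau \in [0, \infty)$
we observe that, for $t \ge \tau$,

$$E[f(Z_t) \mid \Delta Z_\tau = 1] = \sum_{k=0}^\infty P(Z_\tau = k)\, \frac{f(k)}{E f(Z_\tau)}\, E^{k+1}[f(Z_{t-\tau})] = \frac{e^{\gamma(t - 2\tau)}}{\beta} \sum_{k=0}^\infty P(Z_\tau = k)\, f(k) f(k+1)$$

$$= \frac{e^{\gamma(t - 2\tau)}}{\beta} \big( E f^2(Z_\tau) + \gamma\, E f(Z_\tau) \big) = \frac{e^{\gamma(t - 2\tau)}}{\beta}\, (\beta^2 + \beta\gamma)\, e^{2\gamma\tau} = (\gamma + \beta)\, e^{\gamma t},$$

and, for $t < \tau$,

$$E[f(Z_t) \mid \Delta Z_\tau = 1] = \sum_{k=0}^\infty P(Z_t = k)\, f(k)\, \frac{E^k[f(Z_{\tau - t})]}{E f(Z_\tau)} = \sum_{k=0}^\infty P(Z_t = k)\, f(k)^2\, \frac{e^{-\gamma t}}{f(0)} = \frac{e^{-\gamma t}}{\beta}\, E[f^2(Z_t)] = (\gamma + \beta)\, e^{\gamma t} - \gamma.$$

From this we obtain

$$M^\tau(t) = E[Z_t \mid \Delta Z_\tau = 1] - 1_{[\tau, \infty)}(t) = \Big( \frac{\beta}{\gamma} + 1 \Big) e^{\gamma t} - 1 - \frac{\beta}{\gamma},$$

and, by differentiating, this implies $dM^\tau(t) = (\beta + \gamma)\, e^{\gamma t}\, dt$. □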
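
The first moment formula of Lemma 1.11 can be checked by a small Monte Carlo experiment. The sketch below simulates the pure birth process with jump rate f(k) in state k and estimates $E^0[f(Z_t)]$, to be compared with $\beta e^{\gamma t}$; the function names, the sample size and the seed are assumptions made for this example.

```python
import math
import random


def sample_Z(f, t_end, rng, start=0):
    """Sample Z at time t_end for the pure birth process started in `start`."""
    k, t = start, 0.0
    while True:
        t += rng.expovariate(f(k))  # exponential holding time with rate f(k)
        if t > t_end:
            return k
        k += 1


def mean_f_of_Z(gamma, beta, t_end, n_samples=20000, seed=0):
    """Monte Carlo estimate of E[f(Z_t)] for f(k) = gamma * k + beta, Z_0 = 0."""
    rng = random.Random(seed)

    def f(k):
        return gamma * k + beta

    total = sum(f(sample_Z(f, t_end, rng)) for _ in range(n_samples))
    return total / n_samples
```

With $\gamma = \frac12$, $\beta = 1$ and $t = 1$ the estimate should be close to $e^{1/2}$, up to Monte Carlo error.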



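Proof of Proposition 1.2. As $M^\tau$ depends only on whether $\tau = \ell$ or not, the state space S
can be collapsed into a space with just two points. The operator $A_\alpha$ becomes a 2 x 2-matrix
which, as we see from the formulas below, has finite entries if and only if $\gamma < \alpha < 1 - \gamma$. This
implies that there exists a giant component if $\gamma \ge \frac12$, as in this case the operator $A_\alpha$ is never
well-defined. Otherwise, denoting the collapsed state of $[0, \infty)$ by r, the matrix has entries

$$a_{x, r} = \beta \int_0^\infty e^{(\gamma + \alpha - 1) t}\, dt = \frac{\beta}{1 - \gamma - \alpha}, \quad \text{for } x \in \{r, \ell\},$$

$$a_{\ell, \ell} = \beta \int_0^\infty e^{(\gamma - \alpha) t}\, dt = \frac{\beta}{\alpha - \gamma},$$

$$a_{r, \ell} = (\beta + \gamma) \int_0^\infty e^{(\gamma - \alpha) t}\, dt = \frac{\beta + \gamma}{\alpha - \gamma}.$$

Then $\rho(A_\alpha)$ is the (unique) positive solution of the quadratic equation

$$x^2 (1 - \gamma - \alpha)(\alpha - \gamma) - x (\beta - 2\beta\gamma) - \beta\gamma = 0.$$

This function is minimal when the factor in front of $x^2$ is maximal, i.e. when $\alpha = \frac12$. We note
that

$$\rho(A_{1/2}) = \frac{\beta + \sqrt{\beta^2 + \gamma\beta}}{\frac12 - \gamma},$$

which indeed exceeds one if and only if

$$\beta > \frac{(\frac12 - \gamma)^2}{1 - \gamma}. \qquad \Box$$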
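
The spectral radius in the linear case can be evaluated from the quadratic equation and compared with the closed form at $\alpha = \frac12$. The sketch below assumes the quadratic $x^2 (1 - \gamma - \alpha)(\alpha - \gamma) - x(\beta - 2\beta\gamma) - \beta\gamma = 0$ and the value $(\beta + \sqrt{\beta^2 + \gamma\beta}) / (\frac12 - \gamma)$ at $\alpha = \frac12$; both function names are hypothetical.

```python
import math


def spectral_radius_linear(gamma, beta, alpha):
    """Positive root of the quadratic equation for rho(A_alpha)."""
    if not gamma < alpha < 1 - gamma:
        raise ValueError("need gamma < alpha < 1 - gamma for finite entries")
    d = (1 - gamma - alpha) * (alpha - gamma)  # factor in front of x^2
    b = beta * (1 - 2 * gamma)
    return (b + math.sqrt(b * b + 4 * d * beta * gamma)) / (2 * d)


def spectral_radius_at_half(gamma, beta):
    """Closed form of the spectral radius at alpha = 1/2."""
    return (beta + math.sqrt(beta * beta + gamma * beta)) / (0.5 - gamma)
```

Evaluating `spectral_radius_linear` on a grid of values of $\alpha$ illustrates that the minimum is attained at $\alpha = \frac12$, and at $\gamma = \frac14$, $\beta = \frac{1}{12}$ the spectral radius equals one, which is the boundary case of Proposition 1.2.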



1.5 Overview 

The remainder of this paper is devoted to the proofs of the main results. In Section 2 we discuss
the process describing the indegree evolution of a fixed vertex in the network and compare it to
the process $(Z_t : t \ge 0)$. The results of this section will be frequently referred to throughout the
main parts of the proof. Section 3 is devoted to the study of the idealized branching random
walk and explores its relation to the properties of the family of operators $(A_\alpha : \alpha \in \mathcal{I})$. The
main result of this section is Lemma 3.3 which shows how survival of the killed IBRW can be
characterised in terms of these operators. Two important tools in the proof of Theorem 1.1 are
provided in Section 4, namely the sprinkling argument that enables us to make statements about
the giant component from local information, see Proposition 4.1, and Lemma 4.2 which ensures
by means of a soft argument that the oldest vertices are always in large connected components.

The core of the proof of all our theorems is provided in Sections 5 and 6. In Section 5 we
introduce the exploration process, which systematically explores the neighbourhood of a given
vertex in the network. We couple this process with a random labelled tree and show that
this coupling is successful with high probability, see Lemma 5.2. This random labelled tree,
introduced in Subsection 5.1, is still dependent on the network size N, but significantly easier
to study than the exploration process itself. Section 6 uses further coupling arguments to relate
the random labelled tree of Subsection 5.1 for large N with the idealized branching random
walk. The main result of these core sections is summarised in Proposition 6.1.

In Section 7 we use a coupling technique similar to that in Section 5 to produce a variance
estimate for the number of vertices in components of a given size, see Proposition 7.1. Using
the machinery provided in Sections 4 to 7 the proof of Theorem 1.7 is completed in Section 8
and the proof of Theorem 1.8 is completed in Section 9. Recall that Theorem 1.7 provides a
criterion for the existence of a giant component given in terms of the survival probability of the
killed idealized branching random walk. In Theorem 1.1 this criterion is formulated in terms of
the family of operators $(A_\alpha : \alpha \in \mathcal{I})$, and the proof of this result therefore follows by combining
Theorem 1.7 with Lemma 3.3.

The proof of the percolation result, Theorem 1.5, requires only minor modifications of the
arguments leading to Theorem 1.1 and is sketched in Section 10. In a short appendix we
have collected some auxiliary coupling lemmas of general nature, which are used in Section 6.
Throughout the paper we use the convention that the value of positive, finite constants c, C
can change from line to line, but more important constants carry an index corresponding to the
lemma or formula line in which they were introduced.

2 Properties of the degree evolution process 

We denote by $Z[m, n]$, $m \le n$, the indegree of vertex m at time n. Then, for each $m \in \mathbb{N}$,
the degree evolution process $(Z[m, n] : n \ge m)$ is a time inhomogeneous Markov process with
transition probabilities in the time-step $n \to n + 1$ given by

$$p_{k, k+1} = \frac{f(k)}{n} \wedge 1 \quad \text{and} \quad p_{k, k} = 1 - p_{k, k+1} \quad \text{for integers } k \ge 0.$$

Moreover, the evolutions $(Z[m, \cdot] : m \in \mathbb{N})$ are independent. We suppose that under $P^k$ the
evolution $(Z[m, n] : n \ge m)$ starts in $Z[m, m] = k$. We write

$$P_{m,n}\, g(k) = E^k[g(Z[m, n])] \quad \text{for any } g\colon \{0, 1, \dots\} \to (0, \infty).$$

In this section we provide several preliminary results for the process $(Z[m, n] : n \ge m)$ and its
continuous time analogue $(Z_t : t \ge 0)$. These form the basis for the computations in the network.



We start by analysing the process $(Z_t : t \ge 0)$ in Section 2.1 and then give the analogous results
for the processes $(Z[m, n] : n \ge m)$ in Section 2.2. We then compare the processes in Section 2.3.



2.1 Properties of the pure birth process $(Z_t : t \ge 0)$

We start with a simple upper bound. 

Lemma 2.1. Suppose that f is a $\gamma$-attachment rule. Then, for all $s, t \ge 0$ and integers $k \ge 0$,

$$E^k[f(Z_t)] \le f(k)\, e^{\gamma t} \quad \text{and} \quad P_{t+s} f(k) \le e^{\gamma s}\, P_t f(k).$$

Proof. Note that $(Z_t : t \ge 0)$ is stochastically increasing in f. We can therefore obtain the
result for fixed $k \ge 0$ by using that $f(n) \le f(k) + \gamma (n - k)$ for $n \ge k$, and comparing with the
linear model described in Lemma 1.11. □

The next two lemmas allow a comparison of the processes $(Z^{(\tau)}_t : t \ge 0)$ for different values of $\tau$.

Lemma 2.2. For an attachment rule f and integers $k \ge 0$, one has

$$\frac{P_t f(k+1)}{P_t f(k)} \le \frac{f(k+1)}{f(k)}$$

for all $t \ge 0$. Moreover, if f is linear, then equality holds in the display above.

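Proof. In the following, we work under the measure $P = P^{k+1}$, and we suppose that $(U_j : j \ge 0)$
is a sequence of independent random variables, uniformly distributed in [0, 1], that are
independent of $(Z_t : t \ge 0)$. We denote by $T_1, T_2, \dots$ the random jump times of $(Z_t : t \ge 0)$ in increasing
order, set $T_0 = 0$, and consider the process $(\bar{Z}_t : t \ge 0)$ starting in k that is constant on each
interval $[T_j, T_{j+1})$ and satisfies

$$\bar{Z}_{T_{j+1}} = \bar{Z}_{T_j} + 1\{ U_j \le f(\bar{Z}_{T_j}) / f(Z_{T_j}) \}. \qquad (3)$$

It is straightforward to verify that $(\bar{Z}_t : t \ge 0)$ has the same distribution as $(Z_t : t \ge 0)$ under $P^k$.
By the concavity of f we conclude that

$$\frac{f(\bar{Z}_{T_j})}{f(Z_{T_j})} \ge \frac{f(k) + (\bar{Z}_{T_j} - k)\, \frac{f(Z_{T_j}) - f(k)}{Z_{T_j} - k}}{f(k) + (Z_{T_j} - k)\, \frac{f(Z_{T_j}) - f(k)}{Z_{T_j} - k}}$$

and $\frac{f(Z_{T_j}) - f(k)}{Z_{T_j} - k} \le \Delta f(k)$, so that

$$\frac{f(\bar{Z}_{T_j})}{f(Z_{T_j})} \ge \frac{f(k) + \Delta f(k)\, (\bar{Z}_{T_j} - k)}{f(k) + \Delta f(k)\, (Z_{T_j} - k)}. \qquad (4)$$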




Next, we couple the processes $(\bar{Z}_{T_j} : j \ge 0)$ and $(Z_{T_j} : j \ge 0)$ with a Pólya urn model. Initially
the urn contains balls of two colours, blue balls of weight $B_0 = \ell := f(k)/\Delta f(k)$, and red balls
of weight one. In each step a ball is picked with probability proportional to its weight and a
ball of the same colour is inserted to the urn which increases its weight by one. Recalling that
the total weight after j draws is $j + \ell + 1$, it is straightforward to see that we can choose the
weight of the blue balls after j steps as

Now (3) and (4) imply that whenever we pick a blue ball in the jth step, the evolution $(\bar{Z}_t : t \ge 0)$
increases by one at time $T_j$. Note that $(Z_t : t \ge 0)$ is independent of $(U_j : j \ge 0)$ so that

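\[ \mathbb E[\bar Z_t \mid Z_t = n+k+1] - k \ge \mathbb E[B_n - B_0] = \frac{\ell}{\ell+1}(n+\ell+1) - \ell = \frac{\ell}{\ell+1}\,n = \frac{f(k)}{f(k+1)}\,n, \]
and, by the concavity of $f$,
\[ \begin{aligned}
\mathbb E[f(\bar Z_t) \mid Z_t = n+k+1] &\ge f(k) + \frac{f(n+k+1) - f(k+1)}{n}\,\big(\mathbb E[\bar Z_t \mid Z_t = n+k+1] - k\big) \\
&\ge f(k) + \big(f(n+k+1) - f(k+1)\big)\frac{f(k)}{f(k+1)} \\
&= f(k)\,\frac{f(n+k+1)}{f(k+1)},
\end{aligned} \tag{5} \]
so that
\[ \frac{P_t f(k+1)}{P_t f(k)} = \frac{\mathbb E[f(Z_t)]}{\mathbb E[f(\bar Z_t)]} \le \frac{f(k+1)}{f(k)}. \]
If $f$ is linear all inequalities above become equalities. □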
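The urn identity used in the proof can be checked with exact arithmetic. The following is a minimal sketch, assuming the urn starts with blue weight $\ell$ and red weight one and that each draw adds weight one to the drawn colour; the function name `expected_blue_weight` and the sample value of $\ell$ are illustrative choices, not taken from the paper.

```python
from fractions import Fraction


def expected_blue_weight(ell, n):
    """Exact E[B_n] for a two-colour Polya urn.

    The urn starts with blue weight `ell` and red weight 1; each draw picks a
    colour with probability proportional to its weight and adds weight 1 to it.
    """
    mean = Fraction(ell)
    for j in range(n):
        total = Fraction(ell) + 1 + j   # total weight before draw j + 1
        mean += mean / total            # blue gains 1 with probability B_j / total
    return mean


if __name__ == "__main__":
    ell = Fraction(3, 2)                # plays the role of f(k) / Delta f(k)
    for n in (1, 5, 20):
        gain = expected_blue_weight(ell, n) - ell
        # The proof uses E[B_n - B_0] = ell * n / (ell + 1).
        print(n, gain, ell * n / (ell + 1))
```

Since $\ell/(\ell+1) = f(k)/f(k+1)$ when $\ell = f(k)/\Delta f(k)$, the printed gain is the lower bound on $\mathbb E[\bar Z_t \mid Z_t = n+k+1] - k$ that enters (5).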

Next, we show that the semigroup $(P_t)$ preserves concavity.

Lemma 2.3. For every concave and monotonically increasing $g$ and every $t \ge 0$, the function
$P_t g$ is concave and monotonically increasing.



Proof. We use an urn coupling argument similar to the one in the proof of Lemma 2.2. Fix
$k \ge 0$ and let $(Z^{(2)}_t : t \ge 0)$ be the pure birth process started in $Z^{(2)}_0 = k+2$. Denote $T_0 = 0$ and
let $(T_j : j = 1, 2, \dots)$ be the breakpoints of the process in increasing order. Suppose $(U_j : j \ge 0)$
is a sequence of independent random variables that are uniformly distributed on $[0,1]$. For
$i \in \{0, 1\}$, we now denote by $(Z^{(i)}_t : t \ge 0)$ the step functions starting in $k+i$ which have jumps
of size one precisely at those times $T_{j+1}$, $j \ge 0$, where



\[ U_j \le \frac{f(Z^{(i)}_{T_j})}{f(Z^{(2)}_{T_j})}. \]

By concavity of $f$ we get
\[ \mathbb P\big(\Delta Z^{(1)}_{T_{j+1}} = 1 \,\big|\, \Delta Z^{(0)}_{T_{j+1}} = 0\big) = \frac{f(Z^{(1)}_{T_j}) - f(Z^{(0)}_{T_j})}{f(Z^{(2)}_{T_j}) - f(Z^{(0)}_{T_j})} \ge \frac{Z^{(1)}_{T_j} - Z^{(0)}_{T_j}}{Z^{(2)}_{T_j} - Z^{(0)}_{T_j}}. \tag{6} \]
Let $(\bar T_j : j = 1, 2, \dots)$ be the elements of the possibly finite set $\{T_j : j \ge 1, \Delta Z^{(0)}_{T_j} = 0\}$ in increasing
order. We consider a Pólya urn model starting with one blue and one red ball. We denote by $B_n$
the number of blue balls after $n$ steps. By (6) we can couple the urn model with our indegree
evolutions such that



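\[ \Delta B_j \le \Delta Z^{(1)}_{\bar T_j}, \]
and such that the sequence $(B_j)_{j \in \mathbb N}$ is independent of $(Z^{(0)}_t : t \ge 0)$ and $(Z^{(2)}_t : t \ge 0)$. Let $\bar g$ be
the linear function on $[l, l+2+m]$ with $\bar g(l) = g(l)$ and $\bar g(l+2+m) = g(l+2+m)$. Then
\[ \begin{aligned}
\mathbb E[g(Z^{(1)}_t) \mid Z^{(0)}_t = l,\ Z^{(2)}_t = l+2+m] &\ge \bar g\big(\mathbb E[Z^{(1)}_t \mid Z^{(0)}_t = l,\ Z^{(2)}_t = l+2+m]\big) \\
&\ge \bar g(l - 1 + \mathbb E B_{2+m}) \\
&= \bar g\big(l + 1 + \tfrac{m}{2}\big) = \tfrac12\big[g(l) + g(l+2+m)\big].
\end{aligned} \]
Therefore,
\[ P_t g(k+1) = \mathbb E[g(Z^{(1)}_t)] \ge \tfrac12\big[\mathbb E[g(Z^{(0)}_t)] + \mathbb E[g(Z^{(2)}_t)]\big] = \tfrac12\big[P_t g(k) + P_t g(k+2)\big], \]
which implies the concavity of $P_t g$. □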
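The statements of Lemmas 2.2 and 2.3 can be probed numerically. The sketch below is only an illustration under stated assumptions: it replaces the exact semigroup $P_t$ by an explicit Euler discretisation of the backward equation $\partial_t u = f\,\Delta u$, and it uses the sample rule $f(k) = \tfrac12\sqrt{k+1}$, which is concave and increasing with increments below one. Both the helper name `semigroup_euler` and this choice of $f$ are hypothetical.

```python
import math


def semigroup_euler(f, g, t, k_max, steps=400):
    """Approximate P_t g(k) = E^k[g(Z_t)] for k = 0..k_max.

    Z is the pure birth process jumping from k to k + 1 at rate f(k). The
    backward equation du/dt = f(k) * (u(k + 1) - u(k)) is integrated with
    explicit Euler steps, so this is an approximation of the semigroup.
    """
    h = t / steps
    # Each Euler step consumes one state at the top, so start with enough states.
    u = [g(k) for k in range(k_max + steps + 2)]
    for _ in range(steps):
        u = [u[k] + h * f(k) * (u[k + 1] - u[k]) for k in range(len(u) - 1)]
    return u[: k_max + 1]


def f_sqrt(k):
    """Sample concave attachment rule (an assumption made for illustration)."""
    return 0.5 * math.sqrt(k + 1)


if __name__ == "__main__":
    u = semigroup_euler(f_sqrt, f_sqrt, t=1.0, k_max=6)
    for k in range(5):
        ratio_gap = f_sqrt(k + 1) / f_sqrt(k) - u[k + 1] / u[k]   # Lemma 2.2: >= 0
        second_diff = u[k + 2] - 2 * u[k + 1] + u[k]              # Lemma 2.3: <= 0
        print(k, ratio_gap, second_diff)
```

For a linear rule the ratio gap vanishes up to rounding, in line with the equality case of Lemma 2.2.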



The fact that the semigroup preserves concavity allows us to generalise Lemma 2.2.

Lemma 2.4. For an attachment rule $f$ and integers $k \ge 0$ and $s, t \ge 0$, one has
\[ \frac{P_{t+s} f(k+1)}{P_{t+s} f(k)} \le \frac{P_s f(k+1)}{P_s f(k)}. \]



Proof. The statement follows by a slight modification of the proof of Lemma 2.2. We use $\bar Z$ and $Z$ as in

serve tl 

g(k) := 



the proof of the latter lemma and observe that by Lemma 2.3 the function 

P s f(k + 1) 



Psf(k) 
is concave and increasing. Similarly as in (5) we get

E[g(Z t )\Z t = n + k + l}> g(k) + g(n + k + 1} - 9{k + X) (E[Z t |Z f = n + fc + 1] - fc) 

>«,(*) + G/(n + A: + l)- 5 (fc + l))- ■ /(/ '' ) 



f(k + l) 



2 g(k) + ( 5 (n + k + 1) - g(k + 1)) ^^ = 5 (n + A; + 1) //( / '' ) 



The rest of the proof is in line with the proof of Lemma 2.2. □

Lemma 2.5 (Stochastic domination). One can couple the process $(Z^{[\tau]}_t : t \ge 0)$ with start in
$Z^{[\tau]}_0 = k$ and the process $(Z_t : t \ge 0)$ with start in $Z_0 = k+1$ in such a way that
\[ \{t > 0 : \Delta Z^{[\tau]}_t = 1\} \subset \{t > 0 : \Delta Z_t = 1\} \cup \{\tau\}. \]
In particular, this implies that $Z^{[\tau]}_t + 1\{t < \tau\} \le Z_t$ for all $t \ge 0$. In the linear case we have
equality in both formulas.

Proof. Suppose $(Z^{(2)}_t : t \ge 0)$ has the distribution of $(Z_t : t \ge 0)$ with start in $Z_0 = k+1$,
let $T_0 = 0$ and let $(T_j : j = 1, 2, \dots)$ be the times of discontinuities of $(Z^{(2)}_t : t \ge 0)$ in increasing
order. Denote by $(U_j : j \ge 0)$ a sequence of independent random variables that are uniformly
distributed on $[0,1]$. Now define $(Z^{(1)}_t : t \ge 0)$ as the step function starting in $k$ which increases
by one (i) at time $T_{j+1} < \tau$ if



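\[ U_j \le \frac{f(Z^{(1)}_{T_j})\, P_{\tau - T_{j+1}} f(Z^{(1)}_{T_j} + 1)}{f(Z^{(2)}_{T_j})\, P_{\tau - T_{j+1}} f(Z^{(1)}_{T_j})}, \tag{7} \]
(ii) at time $\tau$, and (iii) at time $T_{j+1} > \tau$ if
\[ U_j \le \frac{f(Z^{(1)}_{T_j})}{f(Z^{(2)}_{T_j})}. \tag{8} \]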

Clearly, we have $Z^{(1)}_t + 1 \le Z^{(2)}_t$ for all $t \in [0, \tau)$ and $Z^{(1)}_t \le Z^{(2)}_t$ for general $t \ge 0$. Moreover, by
Lemma 2.2 the right hand sides of the inequalities (7) and (8) are not greater than one and it
is straightforward to verify that the compensator of $(Z^{(1)}_t : t \ge 0)$ is the process in (2), so that it
has the same law as the process $(Z^{[\tau]}_t : t \ge 0)$ with start in $Z^{[\tau]}_0 = k$. □

Remark 2.6. Certainly the approach from above can be used to couple two evolutions $Z^{[\sigma]}$ and
$Z^{[\tau]}$ started in $k$ for arbitrary $0 < \sigma \le \tau$. By Lemma 2.4, one then gets that
\[ \{t > 0 : \Delta Z^{[\tau]}_t = 1\} \setminus \{\tau\} \subset \{t \ge 0 : \Delta Z^{[\sigma]}_t = 1\} \setminus \{\sigma\}. \]

2.2 Properties of the degree evolutions $(\mathcal Z[m,n] : n \ge m)$

For the processes $(\mathcal Z[m,n] : n \ge m)$ we get an analogous version of Lemma 2.1.
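Lemma 2.7. For any $\gamma$-attachment rule $f$, and all integers $k \ge 0$ and $0 < m \le n$,
\[ \mathbb E^k[f(\mathcal Z[m,n])] \le f(k) \Big(\frac{n}{m}\Big)^{\gamma}. \]

Proof. Note that $(Y_n : n \ge m)$ with $Y_n := f(\mathcal Z[m,n]) \prod_{i=m}^{n-1} (1 + \frac{\gamma}{i})^{-1}$ is a supermartingale.
Hence
\[ \mathbb E^k[f(\mathcal Z[m,n])] \le f(k) \prod_{i=m}^{n-1} \Big(1 + \frac{\gamma}{i}\Big) \le f(k) \Big(\frac{n}{m}\Big)^{\gamma}. \]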

We also get the following analogue of Lemma 2.2.

Lemma 2.8. For an attachment rule $f$ and integers $k \ge 0$ and $0 < m \le n$ one has
\[ \frac{P_{m,n} f(k+1)}{P_{m,n} f(k)} \le \frac{f(k+1)}{f(k)}. \]
If $f$ is linear and $f(k+1+l) \le m + l$ for all $l \in \{0, \dots, n-m-1\}$, then equality holds.

Proof. The statement follows by a slight modification of the proof of Lemma 2.2. □

We now provide two lemmas on stochastic domination of the degree evolutions.

Lemma 2.9 (Stochastic domination I). For any integers $0 < m \le n_1 < \dots < n_j$ the process
$(\mathcal Z[m,n] : n \ge m)$ conditioned on the event $\Delta \mathcal Z[m,n_i] = 0$ for all $i \in \{1, \dots, j\}$ is stochastically
dominated by the unconditional process.

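Proof. First suppose that $m < n_1$. For any $k \ge 0$, we have
\[ \mathbb P^k\big(\Delta \mathcal Z[m,m] = 1 \,\big|\, \Delta \mathcal Z[m,n_i] = 0\ \forall i\big) = \frac{f(k)}{m}\, \frac{\mathbb P^{k+1}(\Delta \mathcal Z[m+1,n_i] = 0\ \forall i)}{\mathbb P^k(\Delta \mathcal Z[m,n_i] = 0\ \forall i)}. \]
The denominator on the right is equal to
\[ \frac{f(k)}{m}\, \mathbb P^{k+1}(\Delta \mathcal Z[m+1,n_i] = 0\ \forall i) + \Big(1 - \frac{f(k)}{m}\Big) \mathbb P^k(\Delta \mathcal Z[m+1,n_i] = 0\ \forall i) \ge \mathbb P^{k+1}(\Delta \mathcal Z[m+1,n_i] = 0\ \forall i), \]
and hence we get
\[ \mathbb P^k\big(\Delta \mathcal Z[m,m] = 1 \,\big|\, \Delta \mathcal Z[m,n_i] = 0\ \forall i \in \{1, \dots, j\}\big) \le \frac{f(k)}{m} = \mathbb P^k(\Delta \mathcal Z[m,m] = 1), \tag{9} \]
which is certainly also true if $m = n_1$. The result follows by induction. □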

The next lemma is the analogue of Lemma 2.5.

Lemma 2.10 (Stochastic domination II). For integers $0 \le k < m < n$ there exists a coupling
of the process $(\mathcal Z[m,n] : n \ge m)$ started in $\mathcal Z[m,m] = k$ and conditioned on $\Delta \mathcal Z[m,n] = 1$ and
the unconditional process $(\mathcal Z[m,n] : n \ge m)$ started in $\mathcal Z[m,m] = k+1$ such that for the coupled
random evolutions, say $(\mathcal Z^{(1)}[l] : l \ge m)$ and $(\mathcal Z^{(2)}[l] : l \ge m)$, one has
\[ \Delta \mathcal Z^{(1)}[l] \le \Delta \mathcal Z^{(2)}[l] + 1\{l = n\}, \]
and therefore in particular $\mathcal Z^{(1)}[l] \le \mathcal Z^{(2)}[l]$ for all $l \ge m$.

Proof. Note that
\[ \mathbb P^k\big(\Delta \mathcal Z[m,m] = 1 \,\big|\, \Delta \mathcal Z[m,n] = 1\big) = \frac{\mathbb P^k(\Delta \mathcal Z[m,m] = 1,\ \Delta \mathcal Z[m,n] = 1)}{\mathbb P^k(\Delta \mathcal Z[m,n] = 1)} = \frac{f(k)}{m}\, \frac{\mathbb E^{k+1}[f(\mathcal Z[m+1,n])]}{\mathbb E^k[f(\mathcal Z[m,n])]} = \frac{f(k)\, P_{m+1,n} f(k+1)}{m\, P_{m,n} f(k)}. \]
By Lemma 2.8, we get
\[ \mathbb P^k\big(\Delta \mathcal Z[m,m] = 1 \,\big|\, \Delta \mathcal Z[m,n] = 1\big) \le \frac{f(k)\, P_{m+1,n} f(k+1)}{m\, P_{m+1,n} f(k)} \le \frac{f(k+1)}{m}. \]
Now the coupling of the processes can be established as in Lemma 2.5. □



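Lemma 2.11. For all $m \le n \le n'$ one has $\mathbb P(\Delta \mathcal Z[m,n] = 1) \ge \mathbb P(\Delta \mathcal Z[m,n'] = 1)$.

Proof. It suffices to prove the statement for $n' = n+1$ and $n \ge m$ arbitrary. The statement
follows immediately from
\[ \mathbb P(\Delta \mathcal Z[m,n] = 1) = \frac1n\, \mathbb E[f(\mathcal Z[m,n])] = \frac1n \sum_{k=0}^{\infty} \mathbb P(\mathcal Z[m,n] = k)\, f(k), \]
and
\[ \begin{aligned}
\mathbb P(\Delta \mathcal Z[m,n+1] = 1) &= \frac{1}{n+1} \sum_{k=0}^{\infty} \mathbb P(\mathcal Z[m,n] = k) \Big[\frac{f(k)}{n}\, f(k+1) + \Big(1 - \frac{f(k)}{n}\Big) f(k)\Big] \\
&= \frac1n \sum_{k=0}^{\infty} \mathbb P(\mathcal Z[m,n] = k)\, f(k)\, \underbrace{\frac{n}{n+1}\Big(1 + \frac{\Delta f(k)}{n}\Big)}_{\le 1}. \qquad □
\end{aligned} \]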
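The monotonicity in Lemma 2.11 can be checked by propagating the law of the degree evolution exactly. The sketch below assumes the evolution starts in $\mathcal Z[m,m] = 0$ and jumps from $k$ to $k+1$ at step $n$ with probability $f(k)/n \wedge 1$; the function name `edge_probabilities` and the sample rule $f(k) = \tfrac12\sqrt{k+1}$ are illustrative assumptions.

```python
import math


def edge_probabilities(f, m, n_max):
    """Return [P(Delta Z[m, n] = 1) for n = m..n_max], starting from Z[m, m] = 0.

    At step n the evolution jumps from k to k + 1 with probability
    min(f(k) / n, 1), so P(Delta Z[m, n] = 1) = E[min(f(Z[m, n]) / n, 1)].
    """
    dist = {0: 1.0}                      # law of Z[m, n], updated step by step
    probs = []
    for n in range(m, n_max + 1):
        probs.append(sum(p * min(f(k) / n, 1.0) for k, p in dist.items()))
        new_dist = {}
        for k, p in dist.items():
            up = min(f(k) / n, 1.0)
            new_dist[k + 1] = new_dist.get(k + 1, 0.0) + p * up
            new_dist[k] = new_dist.get(k, 0.0) + p * (1.0 - up)
        dist = new_dist
    return probs


def f_sqrt(k):
    """Sample concave attachment rule (an assumption made for illustration)."""
    return 0.5 * math.sqrt(k + 1)


if __name__ == "__main__":
    for n, p in zip(range(2, 12), edge_probabilities(f_sqrt, 2, 11)):
        print(n, round(p, 6))
```

The printed probabilities decrease in $n$, which is the content of the lemma for this sample rule.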

We finally look at degree evolutions $(\mathcal Z[m,n] : n \ge m)$ conditioned on both the existence and
nonexistence of some edges. In this case we cannot prove stochastic domination and comparison
requires a constant factor.

Lemma 2.12. Suppose that (cn)ngNi ( n N)NeN ar ^ sequences of integers such that limjv-»oo n N = 
oo and c 2 N n^ is bounded from above. Then there exists a constant Qrw > 0, such that for all 
Zo,Zi disjoint subsets of {n^, . . . , N} with $Xo ^ cn and $Xi ^ 1 and, for any m G {1, . . . , N} 
with n Js m, we have 

F(AZ[m,n- 1] = l\AZ[m,i] = IV? GXi, AZ[m,i] = OVi el ) 

< Cfai2|P(AZ[m,w-11 = l| AZ[m,i] = IV* GXi). 

Proof. We have 

P(A2[m,n- 1] = l| AZ[m,i] = IVi GXi, AZ[m,«] = OVi G X ) 

P(A2[m, n - 1] = 1| A2[m, i] = IVi G Xi) 
^ P(A2[m, i] = (M G X | A2[m, i] = IVi 6 li) ' 

and it remains to bound the denominator from below by a positive constant. 
Using Lemma 2.10 and denoting $k = \# \mathcal I_1$ we obtain that



'(A2[m, i] = OVi G X | AZ[m, i] = IVi G X x ) 

> P x (AZ[m,i] = OVf G X ) ^ J\ ^ l {^Z[m,j] = 0) 



j-, r E^/^KiDl y 



jez 



ieio 



By Lemma 2.7 the expectation is bounded from above by f{k)p and moreover f(k) ^ k + 



1 ^ 2cjv for iV large enough. Hence we get, 



n {i - E[/(z[m - j|)1 } > n {i - w-'} > (i - 2 



jeZo ' J jex 



c N n N 7 



using that $Xo ^ cat. As c 2 N n^ l is bounded from above, the expression on the right is 
bounded from zero. This implies the statement. □

2.3 Comparing the degree evolution and the pure birth process

The aim of this section is to show that the processes $(\mathcal Z[m,n] : n \ge m)$ and $(Z_t : t \ge 0)$ are
intimately related. To this end, we set
\[ t_n := \sum_{k=1}^{n-1} \frac1k \quad\text{and}\quad \Delta t_n := t_{n+1} - t_n = \frac1n. \]
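Lemma 2.13. One can couple the random variables $Z_{\Delta t_n}$ and $\mathcal Z[n,n+1]$ under $\mathbb P^k$ such that
\[ \mathbb P(Z_{\Delta t_n} \ne \mathcal Z[n,n+1]) \le (f(k+1)\Delta t_n)^2 \quad\text{and}\quad (k+1) \wedge Z_{\Delta t_n} \le \mathcal Z[n,n+1], \text{ almost surely.} \]

Proof. Note that
\[ \mathbb P^k(Z_{\Delta t_n} = k+1) = f(k)\Delta t_n\, e^{-f(k)\Delta t_n}\, \frac{1}{\Delta t_n} \int_0^{\Delta t_n} e^{-\Delta f(k) u}\, du \ge f(k)\Delta t_n\, e^{-f(k+1)\Delta t_n}. \tag{10} \]
The same lower bound is valid for the probability $\mathbb P^k(\mathcal Z[n,n+1] = k+1)$. Moreover,
\[ \mathbb P^k(Z_{\Delta t_n} = k) = e^{-f(k)\Delta t_n} \ge (1 - f(k)\Delta t_n) \vee 0 = \mathbb P^k(\mathcal Z[n,n+1] = k). \]
Hence, we can couple $Z_{\Delta t_n}$ and $\mathcal Z[n,n+1]$ under $\mathbb P^k$ such that they differ with probability
less than
\[ 1 - \big[f(k)\Delta t_n\, e^{-f(k+1)\Delta t_n} + 1 - f(k)\Delta t_n\big] = f(k)\Delta t_n \big(1 - e^{-f(k+1)\Delta t_n}\big) \le (f(k+1)\Delta t_n)^2, \]
and moreover we have $(k+1) \wedge Z_{\Delta t_n} \le \mathcal Z[n,n+1]$. □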
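The one-step coupling bound can be evaluated explicitly. The sketch below is an illustration under stated assumptions: it uses the closed form of $\mathbb P^k(Z_{\Delta t} = k+1)$ for a pure birth process with distinct rates $f(k) \ne f(k+1)$, and it measures the failure probability of a coupling that maximises agreement on the states $k$ and $k+1$. The function name `one_step_coupling_gap` and the sample rule are hypothetical.

```python
import math


def one_step_coupling_gap(fk, fk1, dt):
    """Return (gap, bound) for one step of length dt started in state k.

    fk = f(k) and fk1 = f(k + 1) are the birth rates (assumed distinct).
    `gap` is one minus the agreement mass on the states k and k + 1 and
    `bound` is the quantity (f(k + 1) * dt) ** 2 from Lemma 2.13.
    """
    cont_stay = math.exp(-fk * dt)                       # P^k(Z_dt = k)
    disc_stay = max(1.0 - fk * dt, 0.0)                  # P^k(Z[n, n+1] = k)
    disc_jump = 1.0 - disc_stay                          # P^k(Z[n, n+1] = k + 1)
    # Exactly one birth in [0, dt]: integrate over the time of the first jump.
    cont_jump = fk * (math.exp(-fk * dt) - math.exp(-fk1 * dt)) / (fk1 - fk)
    agree = min(cont_stay, disc_stay) + min(cont_jump, disc_jump)
    return 1.0 - agree, (fk1 * dt) ** 2


if __name__ == "__main__":
    f = lambda k: 0.5 * math.sqrt(k + 1)                 # sample rule, an assumption
    for n in (1, 5, 25, 125):
        gap, bound = one_step_coupling_gap(f(2), f(3), 1.0 / n)
        print(n, gap, bound)
```

Both columns decay like $n^{-2}$, which is what makes the errors summable in the comparison of the two semigroups.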

Proposition 2.14. There exist constants $n_0 \in \mathbb N$ and $C_{2.14} > 0$ such that for all integers
$n_0 \le m \le n$ and $0 \le k < m$,

\P m ,nf{k) - Pt n -t m f(k)\ < Cferu^ P m ,„/(A:). 

The proof of the proposition uses several preliminary results on the semigroups $(P_t : t \ge 0)$ and
$(P_{m,n} : n \ge m)$, which we derive first. For a stochastic domination argument we introduce a
further time inhomogeneous Markov process. For integers $n, k \ge 0$, we suppose that
\[ \mathbb P^k(\bar{\mathcal Z}[n,n+1] = k+1) = 1 - \mathbb P^k(\bar{\mathcal Z}[n,n+1] = k) = \Big(\frac{f(k)}{n} + \frac12 f(k)\, \Delta f(0)\, e^{\Delta f(0)}\, \frac{1}{n^2}\Big) \wedge 1. \]
The corresponding semigroup is denoted by $(\bar P_{m,n})_{m \le n}$.
Lemma 2.15. Assume that there exists $n_0 \in \mathbb N$ such that, for all integers $n \ge n_0$,
\[ \frac{f(n)}{n} + \frac12 f(n)\, \Delta f(0)\, e^{\Delta f(0)}\, \frac{1}{n^2} \le 1. \tag{11} \]
Then, for all integers $n \ge n_0$ and $0 \le k \le n$, and an increasing concave $g : \{0, 1, 2, \dots\} \to \mathbb R$,
\[ P_{\Delta t_n} g(k) \le \bar P_{n,n+1} g(k). \]

Proof. Consider $\bar f(l) = f(k) + \Delta f(k)(l - k)$. Note that by comparison with the linear model
\[ f(k) + \Delta f(k)\big(\mathbb E^k[Z_t] - k\big) = \mathbb E^k[\bar f(Z_t)] \le f(k)\, e^{\Delta f(k) t}. \]
Hence, for $t \in [0,1]$, using that $e^x \le 1 + x + \frac12 x^2 e^x$ for $x \ge 0$,
\[ \mathbb E^k[Z_t] - k \le \frac{f(k)}{\Delta f(k)} \big(e^{\Delta f(k) t} - 1\big) \le f(k)\, t + \frac12 f(k)\, \Delta f(k)\, e^{\Delta f(k) t}\, t^2. \]
Therefore, $\mathbb E^k[Z_{\Delta t_n}] \le \mathbb E^k[\bar{\mathcal Z}[n,n+1]]$ for all $n \ge n_0$. As $g$ is increasing and concave and $\bar{\mathcal Z}$ has
only increments of size one, we get
\[ \begin{aligned}
\mathbb E^k[g(Z_{\Delta t_n})] &\le g(k) + (g(k+1) - g(k))\, \mathbb E^k[Z_{\Delta t_n} - k] \\
&\le g(k) + (g(k+1) - g(k))\, \mathbb E^k[\bar{\mathcal Z}[n,n+1] - k] = \mathbb E^k[g(\bar{\mathcal Z}[n,n+1])],
\end{aligned} \]
as required to complete the proof. □

Lemma 2.16. There exists a constant $C_{2.16} > 0$, depending on $f$, such that for all integers
$0 \le k \le m$ and $0 < m \le n$, we have
\[ \bar P_{m,n} f(k) \le C_{2.16}\, P_{m,n} f(k). \]

Proof. For $n, m \in \mathbb N$ with $n \ge m$ let $c_{m,n} := \prod_{l=m}^{n-1} (1 + \frac{\kappa}{l^2})$ where $\kappa := \frac12 (\Delta f(0))^2 e^{\Delta f(0)}$. We
prove by induction (over $n - m$) that for all $0 < m \le n$ and $0 \le k \le m$,
\[ \bar P_{m,n} f(k) \le c_{m,n}\, P_{m,n} f(k). \]
Certainly the statement is true if $n = m$. Moreover, we have
\[ \bar P_{m,n+1} f(k) = P_{m,m+1} \bar P_{m+1,n+1} f(k) + (\bar P_{m,m+1} - P_{m,m+1}) \bar P_{m+1,n+1} f(k), \]
and applying the induction hypothesis we get
\[ \bar P_{m,n+1} f(k) \le c_{m+1,n+1}\, P_{m,n+1} f(k) + (\bar P_{m,m+1} - P_{m,m+1}) \bar P_{m+1,n+1} f(k). \]
Moreover, for a function $g : \{0, 1, 2, \dots\} \to \mathbb R$, we have
\[ (\bar P_{m,m+1} - P_{m,m+1})\, g(k) \le \frac12 f(k)\, \Delta f(0)\, e^{\Delta f(0)}\, \frac{1}{m^2}\, \Delta g(k). \tag{12} \]
Note that the transition probabilities of the new inhomogeneous Markov process have a particular
product structure: For all integers $a \ge 1$ and $b \ge 0$, one has
\[ \mathbb P^b(\bar{\mathcal Z}[a,a+1] = b+1) = (\psi_a \cdot f(b)) \wedge 1, \quad\text{for } \psi_a := \frac1a + \frac12 \Delta f(0)\, e^{\Delta f(0)}\, \frac{1}{a^2}. \]
This structure allows one to literally translate the proof of Lemma 2.8 and to obtain
\[ \frac{\bar P_{a_1,a_2} f(b_2)}{\bar P_{a_1,a_2} f(b_1)} \le \frac{f(b_2)}{f(b_1)} \]
for integers $a_1, a_2 \ge 1$ and $b_1, b_2 \ge 0$ with $a_1 \le a_2$ and $b_1 \le b_2$. Consequently, using (12) and
the induction hypothesis,
\[ \begin{aligned}
(\bar P_{m,m+1} - P_{m,m+1}) \bar P_{m+1,n+1} f(k) &\le \frac12 f(k)\, \Delta f(0)\, e^{\Delta f(0)}\, \frac{1}{m^2}\, \frac{\Delta f(k)}{f(k)}\, \bar P_{m+1,n+1} f(k) \\
&\le \frac{\kappa}{m^2}\, \bar P_{m+1,n+1} f(k) \le \frac{\kappa}{m^2}\, c_{m+1,n+1}\, P_{m+1,n+1} f(k).
\end{aligned} \tag{13} \]
Altogether, we get
\[ \bar P_{m,n+1} f(k) \le \Big(1 + \frac{\kappa}{m^2}\Big) c_{m+1,n+1}\, P_{m,n+1} f(k) = c_{m,n+1}\, P_{m,n+1} f(k), \]
and the statement follows since all constants are uniformly bounded by $\prod_{l=1}^{\infty} (1 + \frac{\kappa}{l^2}) < \infty$. □

Proof of Proposition 2.14. We choose $n_0$ as in Lemma 2.15 and let $k, m, n$ be integers with
$n_0 \le m \le n$ and $0 \le k \le m$. We represent $\mathbb E^k[f(\mathcal Z[m,n])] - \mathbb E^k[f(Z_{t_n - t_m})]$ as a telescoping sum
\[ P_{m,n} f(k) - P_{t_n - t_m} f(k) = \sum_{l=m}^{n-1} \Sigma_l, \qquad \Sigma_l := P_{m,l} (P_{l,l+1} - P_{t_{l+1} - t_l}) P_{t_n - t_{l+1}} f(k). \tag{14} \]
In the following, we fix $l \in \{m, \dots, n-1\}$ and analyse the summand $\Sigma_l$. First note that by
Lemma 2.2 one has for arbitrary integers $0 \le a \le b$,
\[ \varphi(a,b) := \mathbb E^b[f(Z_{t_n - t_{l+1}})] - \mathbb E^a[f(Z_{t_n - t_{l+1}})] \le \frac{f(b) - f(a)}{f(a)}\, \mathbb E^a[f(Z_{t_n - t_{l+1}})]. \tag{15} \]
In the first part of the proof, we provide an upper bound for
\[ \psi(a) := \big|(P_{l,l+1} - P_{t_{l+1} - t_l}) P_{t_n - t_{l+1}} f(a)\big|, \quad\text{for } 0 \le a \le l. \]
We couple $Z_{\Delta t_l}$ and $\mathcal Z[l,l+1]$ under $\mathbb P^a$ as in Lemma 2.13 and denote by $Z^{(1)}$ and $\mathcal Z^{(2)}$ the
respective random variables. There are two possibilities for the coupling to fail: either $Z^{(1)} \ge a+2$
and $\mathcal Z^{(2)} = a+1$, or $Z^{(1)} = a$ and $\mathcal Z^{(2)} = a+1$. Consequently,
\[ \psi(a) \le \mathbb P(Z^{(1)} = a,\ \mathcal Z^{(2)} = a+1)\, \varphi(a, a+1) + \mathbb E\big[1\{Z^{(1)} > a+1\}\, \varphi(a+1, Z^{(1)})\big]. \tag{16} \]
Since, by Taylor's formula,
\[ \mathbb P(Z^{(1)} = a,\ \mathcal Z^{(2)} = a+1) = e^{-f(a)\Delta t_l} - (1 - f(a)\Delta t_l) \le \tfrac12 (f(a)\Delta t_l)^2, \]
we get for the first term of (16), using (15),
\[ \mathbb P(Z^{(1)} = a,\ \mathcal Z^{(2)} = a+1)\, \varphi(a, a+1) \le \tfrac12 (f(a)\Delta t_l)^2\, \frac{\Delta f(a)}{f(a)}\, \mathbb E^a[f(Z_{t_n - t_{l+1}})] \le \tfrac12 f(a) (\Delta t_l)^2\, \mathbb E^a[f(Z_{t_n - t_{l+1}})]. \tag{17} \]



Now consider the second term in (16). We have
\[ \mathbb E\big[1\{Z^{(1)} > a+1\}\, \varphi(a+1, Z^{(1)})\big] \le \mathbb P(Z^{(1)} \ge a+1)\, \mathbb E^{a+1}[\varphi(a+1, Z_{\Delta t_l})] \le f(a)\Delta t_l\, \mathbb E^{a+1}[\varphi(a+1, Z_{\Delta t_l})]. \tag{18} \]
By Lemma 2.1 we have $\mathbb E^{a+1}[f(Z_{\Delta t_l})] \le f(a+1)\, e^{\Delta f(a+1)\Delta t_l}$, so that we conclude with (15) that
\[ \mathbb E^{a+1}[\varphi(a+1, Z_{\Delta t_l})] \le \big(e^{\Delta f(a+1)\Delta t_l} - 1\big)\, \mathbb E^{a+1}[f(Z_{t_n - t_{l+1}})] \le 2\Delta t_l\, \mathbb E^{a+1}[f(Z_{t_n - t_{l+1}})], \]
where we used in the last step that $\Delta f(a+1) \le 1$ and that $e^x \le 1 + 2x$ for $x \in [0,1]$. We
combine this with estimates (16), (17), and (18), and get
\[ \psi(a) \le 3 f(a) (\Delta t_l)^2\, \mathbb E^{a+1}[f(Z_{t_n - t_{l+1}})]. \]

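In the next step, we deduce an estimate for $|\Sigma_l|$ defined in (14). One has
\[ \begin{aligned}
|\Sigma_l| \le P_{m,l}\, \psi(k) &\le 3\Delta t_l\, \mathbb E^k\big[\Delta t_l\, f(\mathcal Z[m,l])\, \mathbb E^{\mathcal Z[m,l]+1}[f(Z_{t_n - t_{l+1}})]\big] \\
&= 3\Delta t_l\, \mathbb E^k\big[1\{\Delta \mathcal Z[m,l] = 1\}\, \mathbb E^{\mathcal Z[m,l+1]}[f(Z_{t_n - t_{l+1}})]\big].
\end{aligned} \]
By Lemma 2.10 we get
\[ \begin{aligned}
|\Sigma_l| &\le 3\Delta t_l\, \mathbb P^k(\Delta \mathcal Z[m,l] = 1)\, \mathbb E^{k+1}\big[\mathbb E^{\mathcal Z[m,l+1]}[f(Z_{t_n - t_{l+1}})]\big] \\
&= 3(\Delta t_l)^2\, \mathbb E^k[f(\mathcal Z[m,l])]\, \mathbb E^{k+1}\big[\mathbb E^{\mathcal Z[m,l+1]}[f(Z_{t_n - t_{l+1}})]\big] \\
&= 3(\Delta t_l)^2\, P_{m,l} f(k)\, P_{m,l+1} P_{t_n - t_{l+1}} f(k+1).
\end{aligned} \tag{19} \]
We write $P_{t_n - t_{l+1}} f(k+1) = P_{t_{l+2} - t_{l+1}} P_{t_n - t_{l+2}} f(k+1)$ and note that, by Lemma 2.3, $P_{t_n - t_{l+2}} f$
is concave. Therefore, we get with Lemma 2.15 that $P_{t_n - t_{l+1}} f(k+1) \le \bar P_{l+1,l+2} P_{t_n - t_{l+2}} f(k+1)$.
Successive applications of this estimate and Lemma 2.16 yield
\[ P_{m,l+1} P_{t_n - t_{l+1}} f(k+1) \le \bar P_{m,n} f(k+1) \le C_{2.16}\, P_{m,n} f(k+1). \tag{20} \]
Recall from Lemma 2.7 that $P_{m,l} f(k) \le (\frac{l}{m})^{\gamma} f(k)$. Combining with (14), (19) and (20) yields
\[ |P_{m,n} f(k) - P_{t_n - t_m} f(k)| \le 3\, C_{2.16}\, f(k)\, P_{m,n} f(k+1)\, m^{-\gamma} \sum_{l=m}^{n-1} l^{-2+\gamma}, \tag{21} \]
for a suitably defined constant $C_{2.14}$ depending only on $\gamma$ and $f$, as required. □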



3 Properties of the family $(A_\alpha : 0 < \alpha < 1)$ of operators

The objective of this section is to study the operators $A_\alpha$ and relate them to the tree INT. We
start with two lemmas on the functional analytic nature of the family $(A_\alpha : \alpha \in \mathcal I)$.

Lemma 3.1.
(a) For any $0 < \alpha < 1$ the following are equivalent

(i) $A_\alpha 1(0) < \infty$;
(ii) $A_\alpha g \in C(\mathcal S)$ for all $g \in C(\mathcal S)$.

The set of $\alpha$ where these conditions hold is denoted by $\mathcal I$.

(b) For any $\alpha \in \mathcal I$ the operator $A_\alpha$ is strongly positive.

(c) For any $\alpha \in \mathcal I$ the operator $A_\alpha$ is compact.

Proof. Recalling the Arzelà-Ascoli theorem, the only nontrivial claim is that, if $A_\alpha 1(0) < \infty$,
then the family $(A_\alpha g : \|g\|_\infty \le 1)$ is equicontinuous. To this end recall that, for $\tau \le \sigma \le \infty$, by
Remark 2.6 we have $M^\tau \ge M^\sigma$ and hence
\[ |A_\alpha g(\tau) - A_\alpha g(\sigma)| \le \int_0^\infty e^{-\alpha t}\, d(M^\tau - M^\sigma)(t). \]
Equicontinuity at $\infty$ follows from this by recalling the definition $M^\infty = \lim_{\tau \to \infty} M^\tau$. Elsewhere,
for $\sigma < \infty$, we use the straightforward coupling of the processes $(Z^{[\sigma]}_t : t \ge 0)$ and $(Z^{[\tau]}_t : t \ge 0)$
with the property that if $Z^{[\sigma]}_{\sigma - \tau} = 0$ then $Z^{[\tau]}_t = Z^{[\sigma]}_{\sigma - \tau + t}$. Hence we get,

/*oo /*oo r /*oo 

/ e- at d{W - W){t) < (1 - e- a{a - T ^) / e- at dU T {t) + E / e- at dZ\ T] t{Z l J ] _ T > 0} 
Jo Jo ^Jo 

Since / °° e - at dM T {t) < E[/ °° e~ at dZ^\t)\ < 4*1(0) < oo, and F{Z l J ] _ T > 0} < ¥ 1 {Z ff _ r > 1} 1 
as a | t, both terms can be made small by making a — r small, proving the claim. □ 

Lemma 3.2. The function $\alpha \mapsto \log \rho(A_\alpha)$ is convex on $\mathcal I$.

Proof. By Theorem 2.5 of [Kat82] the function $\alpha \mapsto \log \rho(A_\alpha)$ is convex, if for each positive
$g \in C(\mathcal S)$, $\varepsilon > 0$ and triplet $\alpha_1 \le \alpha_0 \le \alpha_2$ in $\mathcal I$, there are finitely many positive $g_j \in C(\mathcal S)$ and
functions $\varphi_j : \mathcal I \to \mathbb R$, $j \in \{1, \dots, m\}$, with $\log \varphi_j$ convex, such that
\[ \Big\| A_{\alpha_k} g - \sum_{j=1}^{m} \varphi_j(\alpha_k)\, g_j \Big\| \le \varepsilon \quad\text{for all } k \in \{0, 1, 2\}. \]
This criterion is easily checked using the explicit form of $A_\alpha$, $0 < \alpha < 1$. □

With the help of the following lemma, Theorem 1.1 follows from Theorem 1.7. The result is a
variant of a standard result in the theory of branching random walks adapted to our purpose,
see, e.g., Hardy and Harris [HH09] for a good account of the general theory.

Lemma 3.3. The INT dies out almost surely if and only if there exists $0 < \alpha < 1$ such that $A_\alpha$
is a compact linear operator with spectral radius $\rho(A_\alpha) \le 1$.

Proof. Suppose that such an $\alpha$ exists. By the Krein-Rutman theorem (see, e.g., Theorem
1.3 in Section 3.2 of [Pin95]) there exists an eigenvector $v : \mathcal S \to [0, \infty)$ corresponding to the
eigenvalue $\rho(A_\alpha)$. Our operator $A_\alpha$ is strongly positive, i.e. for every $g \ge 0$ which is positive
somewhere, we have
\[ \min_{\tau \in \mathcal S} A_\alpha g(\tau) > 0, \]
so that $v$ is also bounded away from zero. Let $Z^{(\tau)}_n(dt\, dx)$ be the empirical measure of types and
positions of all the offspring in the $n$th generation of an IBRW started by a single particle of
type $\tau$ positioned at the origin. With every generation of particles in the IBRW we associate a
score
\[ X_n := \int Z^{(\tau)}_n(dt\, dx)\, e^{-\alpha x}\, \frac{v(t)}{v(\tau)}. \]
The assumption $\rho(A_\alpha) \le 1$ implies that $(X_n : n \in \mathbb N)$ is a supermartingale and thus almost
surely convergent. Now fix some $N > 1$, an integer $n \ge 2$ and the state at generation $n-1$.
Suppose there is a particle with location $x < N$ in the $(n-1)$st generation. Then there is a
positive probability (depending on $N$ but not on $n$) that $X_n - X_{n-1} > 1$ and, as $(X_n : n \in \mathbb N)$
converges, this can only happen for finitely many $n$. Hence the location of the leftmost particle
in the IBRW diverges to $+\infty$ almost surely. This implies that the INT dies out almost surely.

Conversely, we assume that $\mathcal I$ is nonempty and fix $\alpha \in \mathcal I$. The Krein-Rutman theorem gives
the existence of an eigenvector of the dual operator, which is a positive, finite measure $\nu$ on the
type space $\mathcal S$ such that $\int v(t)\, \nu(dt) = 1$ and, for all continuous, bounded $f : \mathcal S \to \mathbb R$,
\[ \int A_\alpha f(t)\, \nu(dt) = \rho(A_\alpha) \int f(t)\, \nu(dt). \]
Because $A_\alpha$ is a strongly positive operator, the Krein-Rutman theorem implies that there exists
$\lambda_0 < \rho(A_\alpha)$ such that $|\lambda| \le \lambda_0$ for all $\lambda \in \sigma(A_\alpha) \setminus \{\rho(A_\alpha)\}$, where $\sigma(A_\alpha)$ denotes the spectrum of
the operator. Hence $\rho(A_\alpha)$ is separated from the rest of the spectrum and by Theorem IV.3.16
in [Kat76] this holds for all parameters in a small neighbourhood of $\alpha$. Hence, arguing as in
Note 3 on Chapter II in [Kat76, pp. 568-569], the mapping $\alpha \mapsto \rho(A_\alpha)$ is differentiable and its
derivative equals
\[ \rho'(A_\alpha) := \frac{d}{d\alpha} \int A_\alpha v(t)\, \nu(dt) = \int \frac{d}{d\alpha} A_\alpha v(t)\, \nu(dt), \tag{22} \]
where the second equality can be inferred from the minimax characterisation of eigenvalues, see
e.g. Theorem 1 in [Ram83]. Given $\tau \in \mathcal S$ we define a martingale by
\[ W^{(\tau)}_n = \rho(A_\alpha)^{-n} \int \frac{v(t)}{v(\tau)}\, e^{-\alpha x}\, Z^{(\tau)}_n(dt\, dx), \]
and argue as in Theorem 1 of [KRS01] that it converges almost surely to a strictly positive
limit $W^{(\tau)}$ if
\[ \log \rho(A_\alpha) - \alpha\, \frac{\rho'(A_\alpha)}{\rho(A_\alpha)} > 0 \quad\text{and}\quad \sup_{\tau \in \mathcal S} \mathbb E\big[W^{(\tau)}_1 \log W^{(\tau)}_1\big] < \infty. \tag{23} \]

Let us assume for the moment that the second condition holds true for all $\alpha \in \mathcal I$. Then, if $\alpha$
is such that the limit $W^{(\tau)}$ exists and is positive, it also exists for the offspring of any particle of
type $\tau$ in position $x$, and we denote it by $W^{(\tau)}(x)$. By decomposing the population in the $m$th
generation according to their ancestor in the $n$th generation, and then letting $m \to \infty$, we get
\[ W^{(\tau)} = \rho(A_\alpha)^{-n} \int \frac{v(t)}{v(\tau)}\, e^{-\alpha x}\, W^{(t)}(x)\, Z^{(\tau)}_n(dt\, dx). \]
Denoting by $P_\tau$ the law of the IBRW started with a particle at the origin of type $\tau$, we now look
at the IBRW under the changed measure
\[ dQ = \int \nu(d\tau)\, v(\tau)\, W^{(\tau)}\, dP_\tau. \]
Given a sample IBRW we build a measure $\mu$ on the set of all infinite sequences
\[ ((x_0, t_0), (x_1, t_1), \dots), \]
where $x_j$ is the location and $t_j$ the type of a particle in the $j$th generation, which is a child
of a particle in position $x_{j-1}$ of type $t_{j-1}$, for all $j \ge 1$. This measure is determined by the
requirement that, for any permissible sequence



v{((yo,so), (yi,si),...) : y = x ,s = to 

v(t n ) 



; Vn — %n i ^n — tn / 



= p(A*) n — T exp{-a:E n } . 

Looking unconditionally at the random sequence of particle types thus generated, we note that
it is a stationary Markov chain on $\mathcal S$ with invariant distribution $v(t)\, \nu(dt)$ and transition kernel
given by



P t0 (£) = p(A a ) 



-i v(£) 



v(t 



o) Jo 



-at 



dM t0 (t) 



P to (dt) = p{A a )- 1 ^P- e at dU{t) for t ^ 0. 
v{to) 



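Using first Birkhoff's ergodic theorem and then (22) we see that, $Q$-almost surely, $\mu$-almost
every path has speed
\[ \begin{aligned}
\lim_{n \to \infty} \frac{x_n}{n} &= \frac{1}{\rho(A_\alpha)} \int \mathbb E\Big[\int Z^{(t_0)}(dt\, dx)\, x\, e^{-\alpha x}\, \frac{v(t)}{v(t_0)}\Big]\, v(t_0)\, \nu(dt_0) \\
&= -\frac{1}{\rho(A_\alpha)} \int \frac{\frac{d}{d\alpha} A_\alpha v(t_0)}{v(t_0)}\, v(t_0)\, \nu(dt_0) = -\frac{\rho'(A_\alpha)}{\rho(A_\alpha)} = -\frac{d}{d\alpha} \log \rho(A_\alpha).
\end{aligned} \]
Suppose that $\alpha_0 \in \mathcal I$ is such that
\[ \rho(A_{\alpha_0}) = \min_{\alpha} \rho(A_\alpha) > 1. \]
From Lemma 3.2 we can infer that there exists $\alpha > \alpha_0$ such that the first condition in (23) holds
and
\[ -\frac{d}{d\alpha} \log \rho(A_\alpha) < 0. \]
This implies that, $Q$-almost surely, there exists an ancestral line of particles diverging to $-\infty$. For
the IBRW started with a particle at the origin of type $\ell$ we therefore have a positive probability
that an ancestral line goes to $-\infty$. This implies that the INT has a positive probability of
survival.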



To ensure that the second condition in (23) holds we can use a cut-off procedure, and replace the
offspring distribution $Z^{(\tau)}(dt\, dx)$ by one that takes only the first $N$ children to the right and left
into account. It is easy to see that, for fixed $0 < \alpha < 1$ and sufficiently large $N$, we can ensure
that the modified operator $A^{(N)}_\alpha$ is close to the original one in the operator norm, and as large as
we wish if the original operator is ill-defined. Hence the continuity of the spectral radius in the
operator norm ensures that $\lim_{N \to \infty} \rho(A^{(N)}_\alpha) = \rho(A_\alpha)$, with the spectral radius of an ill-defined
operator being infinity. Using Lemma 3.2 and the fact that a sequence of convex functions,
which converges pointwise, converges uniformly on every closed set, we can choose $N$ so that
for all $0 < \alpha < 1$ the modified operators satisfy $\rho(A^{(N)}_\alpha) > 1$, while the cut-off ensures that the
second criterion in (23) automatically holds. The argument above can now be applied and yields
the existence of an ancestral line of particles diverging to $-\infty$, which then automatically also
exists in the original IBRW. □

Our proofs, in particular the crucial sprinkling technique, rely on the following continuity
property of the survival probability of the INT.

Lemma 3.4. One has
\[ \lim_{\varepsilon \downarrow 0} p(f - \varepsilon) = p(f). \]



Proof. We only need to consider the case where $p(f) > 0$, as otherwise both sides of the
equation are zero. We denote by $\rho(\alpha, f)$ the spectral radius of the operator $A_\alpha$ formed with
respect to the attachment function $f$, setting it equal to infinity if the operator is ill-defined.
The assumption $p(f) > 0$ implies, by Lemma 3.3, that for all $0 < \alpha < 1$ we have $\rho(\alpha, f) > 1$. As
the operator norm $\|A_\alpha\|$ for the operator formed with respect to the attachment function $f - \varepsilon$
depends continuously on $\varepsilon \ge 0$, we can use the continuous dependence of the spectral radius on
the operator norm to obtain, for fixed $\alpha$,
\[ \lim_{\varepsilon \downarrow 0} \rho(\alpha, f - \varepsilon) = \rho(\alpha, f). \]
As a sequence of convex functions, which converges pointwise, converges uniformly on every
closed set, we find $\varepsilon_0 > 0$ such that $\rho(\alpha, f - \varepsilon_0) > 1$ for all $0 < \alpha < 1$. Thus, using again
Lemma 3.3 we have $p(f - \varepsilon_0) > 0$.

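Now we look at the IBRW started with one particle of type $\ell$ in position $t$, constructed using the
attachment rule $f - \varepsilon$, such that any particle with position $> 0$ is killed along with its offspring.
We denote by $E(\varepsilon, t)$ the event that this process survives forever, and by $V(\varepsilon, t, \kappa)$ the event
that a particle reaches a site $< \kappa$. Then we have
\[ \lim_{\kappa \to -\infty}\, \inf_{t < \kappa} \mathbb P(E(\varepsilon_0, t)) = 1. \]
For fixed $\kappa < 0$ and $0 \le \varepsilon \le \varepsilon_0$ we have
\[ \mathbb P(E(\varepsilon, t)) \ge \mathbb P(V(\varepsilon, t, \kappa))\, \mathbb P(E(\varepsilon_0, \kappa)) \longrightarrow \mathbb P(V(0, t, \kappa))\, \mathbb P(E(\varepsilon_0, \kappa)) \quad\text{as } \varepsilon \downarrow 0. \]
Note that the first probability on the right is greater or equal to $p(f)$ and that the second
probability tends to one, as $\kappa$ tends to $-\infty$. □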



4 The giant component

This section provides two crucial tools: a tool to obtain global results from our local approximations
of neighbourhoods given by the 'sprinkling' argument in Proposition 4.1, and an a priori
lower bound on the size of the connected components of the oldest vertices in the system given
in Lemma 4.2. We follow the convention that a sequence of events depending on the index $N$
holds with high probability if the probability of these events goes to one as $N \uparrow \infty$.

Proposition 4.1 (Sprinkling argument). Let $\varepsilon \in (0, f(0))$, $\kappa > 0$, and $\bar f(k) = f(k) - \varepsilon$ for
integers $k \ge 0$. Suppose that $(c_N)_{N \in \mathbb N}$ is a sequence of integers with
\[ \lim_{N \uparrow \infty} \big(\tfrac12 \kappa \varepsilon c_N - \log N\big) = \infty \quad\text{and}\quad \lim_{N \to \infty} \frac{c_N^2}{N} = 0, \]
and that, for the preferential attachment graphs $(\bar G_N)_{N \in \mathbb N}$ with attachment rule $\bar f$, we have
\[ \sum_{v=1}^{N} 1\{|\bar C_N(v)| \ge 2c_N\} \ge \kappa N \quad\text{with high probability,} \]
where $\bar C_N(v)$ denotes the connected component of the vertex $v$ in $\bar G_N$. Then there exists a coupling
of $(G_N)$ with $(\bar G_N)$ such that $\bar G_N \le G_N$ and all connected components of $\bar G_N$ with at least $2c_N$
vertices belong to one connected component in $G_N$ with at least $\kappa N$ vertices, with high probability.

Proof. Note that we can couple $\bar G_N$ and an independent Erdős-Rényi graph $G^{\mathrm{ER}}_N$ with edge
probability $\varepsilon/N$ with $G_N$ such that
\[ \bar G_N \le \bar G_N \vee G^{\mathrm{ER}}_N \le G_N. \tag{24} \]
Here, $\bar G_N \vee G^{\mathrm{ER}}_N$ denotes the graph in which all edges are open that are open in at least one of
the two graphs, and $G' \le G''$ means that all edges that are open in $G'$ are also open in $G''$. We
denote by $V'_N$ the vertices in $\bar G_N$ that belong to components of size at least $2c_N$ and write $V'_N$
as the disjoint union $C_1 \cup \dots \cup C_M$, where $C_1, \dots, C_M$ are sets of vertices such that,

• $|C_j| \in [c_N, 2c_N]$ and

• $C_j$ belongs to one component in $\bar G_N$, for each $j = 1, \dots, M$.

Recall (24), and note that given $\bar G_N$ and the sets $C_1, \dots, C_M$, the Erdős-Rényi graph $G^{\mathrm{ER}}_N$ connects
two distinct sets $C_i$ and $C_j$ with probability at least



e „2 E 



By identifying the individual sets as one vertex and interpreting the $G^{\mathrm{ER}}_N$-connections as edges,
we obtain a new random graph. Certainly, this dominates an Erdős-Rényi graph with $M$ vertices
and success probability $p_N$, which has edge intensity $M p_N$. By assumption, $\frac{\kappa N}{2 c_N} \le M \le N$ with
high probability. Hence $M \to \infty$ and $M p_N - \log M \to \infty$ in probability as $N \uparrow \infty$. By [Hof09,
Thm. 5.6], the new Erdős-Rényi graph is connected with high probability. Hence, all vertices of
$V'_N$ belong to one connected component in $G_N$, with high probability. □
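The merging step of the sprinkling argument can be sketched in code. This is a minimal illustration under stated assumptions, not the construction of the paper: the blocks $C_1, \dots, C_M$ are given as lists of vertices, each potential extra edge is open independently with probability $\varepsilon/N$, and instead of sampling individual edges the sketch samples directly whether at least one of the $|C_i||C_j|$ edges between two blocks is open. The names `sprinkle`, `blocks` and `rng` are hypothetical.

```python
import random


def sprinkle(blocks, n_vertices, eps, rng):
    """Merge blocks using independent extra edges and count the resulting groups.

    Each pair of vertices carries an extra edge with probability eps / n_vertices.
    Two blocks are linked when at least one edge between them is open; linked
    blocks are merged with a union-find structure.
    """
    parent = list(range(len(blocks)))

    def find(i):
        while parent[i] != i:
            parent[i] = parent[parent[i]]   # path halving
            i = parent[i]
        return i

    p_edge = min(eps / n_vertices, 1.0)
    for i in range(len(blocks)):
        for j in range(i + 1, len(blocks)):
            n_pairs = len(blocks[i]) * len(blocks[j])
            p_link = 1.0 - (1.0 - p_edge) ** n_pairs
            if rng.random() < p_link:
                parent[find(i)] = find(j)
    return len({find(i) for i in range(len(blocks))})


if __name__ == "__main__":
    n = 2000
    blocks = [list(range(40 * i, 40 * (i + 1))) for i in range(25)]
    for eps in (0.0, 0.05, 0.5, 5.0):
        print(eps, sprinkle(blocks, n, eps, random.Random(1)))
```

Links between distinct pairs of blocks use disjoint sets of potential edges, so sampling them independently is consistent with the independence of the extra edges.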

We need an 'a priori' argument asserting that the connected components of the old vertices are 
large with high probability. This will in particular ensure that the connected component of any 
vertex connected to an old vertex is large. 

Lemma 4.2 (A priori estimate). Let $(c_N)_{N \in \mathbb N}$ and $(n_N)_{N \in \mathbb N}$ be sequences of positive integers
such that
\[ \lim_{N \to \infty} \frac{c_N}{\log N\, \log\log N} = 0 \quad\text{and}\quad \lim_{N \to \infty} \frac{\log n_N}{\log N} = 0. \]
Denote by $C_N(v) \subset G_N$ the connected component containing $v \in \{1, \dots, N\}$. Then
\[ \mathbb P\big(\# C_N(v) < c_N \text{ for any } v \in \{1, \dots, n_N\}\big) \to 0. \]

Proof. We only need to show this for the case when $f$ is constant, say equal to $\beta > 0$, as all
other cases stochastically dominate this one. Note that in this case all edge probabilities are
independent. We first fix a vertex $v \in \{1, \dots, n_N\}$ and denote by $Z_1 = Z_1(v)$ the number of its
direct neighbours in $(n_N, N/\log N]$. We obtain, for any $\lambda > 0$,


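\[ \mathbb E e^{-\lambda Z_1} = \prod_{j=n_N}^{\lfloor N/\log N \rfloor - 1} \Big(\frac{\beta}{j}\, e^{-\lambda} + 1 - \frac{\beta}{j}\Big), \]
and hence, for sufficiently large $N$,
\[ \log \mathbb E e^{-\lambda Z_1} \le -\beta (1 - e^{-\lambda}) \sum_{j=n_N}^{\lfloor N/\log N \rfloor - 1} \frac1j \le -\tfrac12 \beta (1 - e^{-\lambda}) \log N. \]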

By the exponential Chebyshev inequality we thus get for sufficiently large JV, 

P(Zi < § logJV) < 7V A f-f (1-«~ A ) < TV- &, (25) 

choosing A = g in the last step. Now let Z2 = Z 2 (v) be the number of direct neighbours in 
(A r / log N, N] of any of the Z\{y) vertices who are direct neighbours of v in (njv, iV/ log A 7 ]. We 
obtain, for any A > 0, 

N-l 

E[e- AZ2 |^i] = II ( X + ^ " i)! 1 " (! " f ) Zl ))> 



and hence, for sufficiently large N, on the event {Z\ ^ § log AT}, 



j=[iV/logiV] 



JV-1 

1 . ,„ ^1, fl2 



logE[e- AZ2 |Zi] ^ -(l-e" A )f Zi J^ - < - (1 - e" A ) ^ log ATloglogAT. 

j=LAf/logAfJ J 



By (25) and the exponential Chebyshev inequality (with A = 1) we thus get for sufficiently 
large N, 

F(Z 2 (v) < c N ) < P(Zi < § logN)+F(Z 2 {v) < c N \ Z x > f log AT) 

<^ TV^m +iV -ir lo s lo s 7V + c ^/ lo s Ar . 

Let A = J. By our assumptions on (cn)n&n and (n^v^eM the sum of the right hand sides over 
all v G {1, .. . , ra^r} goes to zero, ensuring that #Cn(v) ^ Z 2 (v) ^ cat for all v G {1, . . . , ^at} 
with high probability. □ 

5 The exploration process 

Our aim is to 'couple' certain aspects of the network to an easier object, namely a random 
tree. To each of these objects we associate a dynamic process called the exploration process. In 
general, an exploration process of a graph successively collects information about the connected 
component of a fixed vertex by following edges emanating from already discovered vertices in 
a well-defined order, so that at each instance the explored part of the graph is a connected 
subgraph of the cluster. We show that the exploration processes of the network and the labelled 
tree can be defined on the same probability space in such a way that up to a stopping time, 
which is typically large, the explored part of the network and the tree coincide. 
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5.1 A random labelled tree 

We now describe a tree $T(w)$ which informally describes the neighbourhood of a vertex $w \in \mathcal{G}_N$.
Any vertex in the tree is labelled by two parameters: its location, an element of $\{1,\dots,N\}$,
and its type, an element of $\{\ell\} \cup \{1,\dots,N\}$. The root is given as a vertex with location $w$
and type $\ell$. A vertex $v$ with location $i$ and type $\ell$ produces independently descendants in the
locations $1,\dots,i-1$ (i.e. to its left) of type $i$ with probability

\[ \mathbb{P}(v \text{ has a descendant in } j \text{ of type } i) = \mathbb{P}(\Delta Z[j,i-1] = 1). \]

Moreover, independently it produces descendants to its right, which are all of type $\ell$, in such
a way that the cumulative sum of these descendants is distributed according to the law of
$(Z[i,j] : i+1 \le j \le N)$. A vertex $v$ of type $k$ produces descendants to the left in the same
way as a vertex of type $\ell$, and independently it produces descendants to the right, which are
all of type $\ell$, in such a way that the cumulative sum of these descendants is distributed as
$(Z[i,j] - 1_{[k,\infty)}(j) : i+1 \le j \le N)$ conditioned on $\Delta Z[i,k-1] = 1$.

Observe that, given the tree and the locations of the vertices, we may reconstruct the types of
the vertices in a deterministic way: any vertex whose parent is located to its left has the type $\ell$,
otherwise the type of the vertex is the location of the parent.
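The deterministic reconstruction of types can be written out as a small routine. The following is only an illustrative sketch: the dictionaries `location` and `parent`, the marker `LEFT` standing for the type $\ell$, and the function name are assumptions made for this example and do not appear in the text.

```python
from typing import Dict, Hashable, Optional, Union

LEFT = "l"  # hypothetical marker for the special type written as ell in the text


def reconstruct_types(
    location: Dict[Hashable, int],
    parent: Dict[Hashable, Optional[Hashable]],
) -> Dict[Hashable, Union[str, int]]:
    """Recover the type of every tree vertex from locations and parent links.

    The root (parent None) has type LEFT by definition. A vertex whose parent
    is located to its left has type LEFT; otherwise its type is the location
    of its parent.
    """
    types: Dict[Hashable, Union[str, int]] = {}
    for node, loc in location.items():
        par = parent[node]
        if par is None or location[par] < loc:
            types[node] = LEFT
        else:
            types[node] = location[par]
    return types
```

For instance, a root at location 5 with one child at location 2 and one at location 8 yields the types 5 and `LEFT` for the two children, respectively.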

The link between this labelled tree and our network is given in the following proposition, which
will be proved in Section 5.3.



Proposition 5.1. Suppose that $(c_N)_{N\in\mathbb{N}}$ is a sequence of integers with

\[ \lim_{N\to\infty} \frac{c_N}{\log N\,\log\log N} = 0. \]

Then one can couple $(V, \mathcal{G}_N)$ and $T(V)$ such that with high probability

\[ \#C_N(V) \wedge c_N = \#T(V) \wedge c_N. \]

5.2 Exploration of the network 

We now specify how we explore a graph like our network or the tree described above, i.e., we 
specify the way we collect information about the connected component, or cluster, of a particular 
vertex v. In the first step, we explore all immediate neighbours of v in the graph. To explain a 
general exploration step we classify the vertices in three categories: 

• veiled vertices: vertices for which we have not yet found connections to the cluster of v; 

• active vertices: vertices which we already know to belong to the cluster, but whose
immediate neighbours have not yet all been explored;

• dead vertices: vertices which belong to the cluster and for which all immediate neighbours 
have been explored. 

After the first exploration step the vertex v is marked as dead, its immediate neighbours as 
active and all the remaining vertices as veiled. In a general exploration step, we choose the 
leftmost active vertex, set its state to dead, and explore its immediate neighbours. The newly 
found veiled vertices are marked as active, and we proceed with another exploration step until
there are no active vertices left.
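The exploration rule described above can be sketched for a deterministic graph as follows. This is a minimal illustration under the assumption that vertices are integers (smaller means further left) and that the graph is given as an adjacency dictionary; the function name `explore_cluster` is hypothetical.

```python
from typing import Dict, List, Set, Tuple


def explore_cluster(adj: Dict[int, Set[int]], v: int) -> Tuple[Set[int], List[int]]:
    """Explore the cluster of v, always processing the leftmost active vertex.

    Returns the set of dead vertices (the whole cluster once no active
    vertices remain) and the order in which vertices were set to dead.
    """
    dead: Set[int] = {v}                 # first step: v is explored and marked dead
    active: Set[int] = set(adj[v]) - {v}  # its immediate neighbours become active
    order: List[int] = [v]
    while active:
        u = min(active)                  # leftmost active vertex
        active.remove(u)
        dead.add(u)
        order.append(u)
        for w in adj[u]:
            if w not in dead and w not in active:  # w is still veiled
                active.add(w)
    return dead, order
```

The adverse events and the additional stopping rules introduced below are deliberately left out of this sketch.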

In the following, we couple the exploration processes of the network and the random labelled tree
started with a particle at position $v$ and type $\ell$ up to a stopping time $T$. Before we introduce
the coupling explicitly, let us list the adverse events which stop the coupling. Whenever the
exploration process of the network revisits an active vertex we have found a cycle in the network.
We call this event (E1) and stop the exploration so that, before time $T$, the explored part of
the neighbourhood of $v$ is a tree with each node having a unique location. Additionally, we stop
once the explored part of the network differs from the explored part of the random labelled tree,
calling this event (E2); we shall see in Section 5.3 how this can happen. In cases (E1) and (E2)
we say that the coupling fails.

Further reasons to stop the exploration are, for certain parameters $1 \le n_N, c_N \le N$,

(A) the number of dead and active vertices exceeds $c_N$,

(B) one vertex in $\{1,\dots,n_N\}$ is activated, and

(C) there are no more active vertices left.

If we stop the exploration without (E1) or (E2) having occurred, we say that the coupling
succeeds. Once the exploration has stopped, the veiled parts of the random tree and the network 
may be generated independently of each other with the appropriate probabilities. Hence, if we 
succeed in coupling the explorations, we have coupled the random labelled tree and the network. 
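The stopping rules for the network exploration can be illustrated on a fixed graph. The sketch below is an assumption-laden illustration, not the coupling itself: it implements (A), (B), (C) and the cycle event (E1) only, it checks them in an order chosen for this example, and the names `explore_with_stops`, `c_N`, `n_N` are hypothetical. Event (E2) involves the random labelled tree and is not modelled.

```python
from typing import Dict, Set, Tuple


def explore_with_stops(
    adj: Dict[int, Set[int]], v: int, c_N: int, n_N: int
) -> Tuple[str, Set[int], Set[int]]:
    """Explore the cluster of v and report why the exploration stopped.

    Returns one of "A", "B", "C", "E1" together with the dead and active sets
    at the moment of stopping.
    """
    dead: Set[int] = set()
    active: Set[int] = {v}
    while True:
        if not active:
            return "C", dead, active          # (C): nothing left to explore
        u = min(active)                       # leftmost active vertex
        active.remove(u)
        dead.add(u)
        for w in sorted(adj[u]):
            if w in dead:
                continue                      # edge along which u was discovered
            if w in active:
                return "E1", dead, active     # revisited an active vertex: cycle
            active.add(w)
            if w <= n_N:
                return "B", dead, active      # (B): an old vertex was activated
            if len(dead) + len(active) > c_N:
                return "A", dead, active      # (A): explored part exceeds c_N
```

In this sketch the initial vertex is assumed to lie outside $\{1,\dots,n_N\}$, in line with the range of initial vertices considered in Lemma 5.2.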

5.3 Coupling the explorations 

To distinguish both exploration processes, we use the term descendant for a child in the
labelled random tree and the term immediate neighbour in the context of the neighbourhood
exploration in the network. In the initial step, we explore all immediate neighbours of v and 
all the descendants of the root. Both explorations are identically distributed and they therefore 
can be perfectly coupled. Suppose now that we have performed k steps and that we have not 
yet stopped the exploration. In particular, this means that both explored subgraphs coincide 
and that any unveiled (i.e. active or dead) element of the labelled random tree can be uniquely 
referred to by its location. We now explore the descendants and immediate neighbours of the 
leftmost active vertex, say n. 

First, we explore the descendants to the left (veiled and dead) and immediately check whether 
they themselves have right descendants in the set of dead vertices. If we discover no dead 
descendants, the set of newly found left descendants is identically distributed to the immediate 
left neighbours in the network. Thus we can couple both explorations such that they agree in 
this case. Otherwise we stop the exploration due to (E2). 

Second, we explore the descendants to the right. If the vertex $n$ is not of type $\ell$, then we know
already that $n$ has no right descendants that were marked as dead when $n$ itself was discovered.
Since we always explore the leftmost active vertex there are no new dead vertices to the right
of $n$. Therefore, the explorations to the right in the network and the random labelled tree are
identically distributed and we stop if we find right neighbours in the set of active vertices due
to (E1). If the vertex $n$ is of type $\ell$, then we have not gained any information about its right
descendants yet. If we find no right descendants in the set of dead vertices, the set of right
descendants is identically distributed to the immediate right neighbours of $n$ in the network. We
stop if right descendants are discovered that were marked as dead, corresponding to (E2), or if
right descendants are discovered in the set of active vertices, corresponding to (E1).

Lemma 5.2. Suppose that $(c_N)_{N\in\mathbb{N}}$, $(n_N)_{N\in\mathbb{N}}$ are sequences of integers such that

\[ \lim_{N\to\infty} \frac{c_N^2}{n_N^{1-\gamma}} = 0. \]

Then the coupling of the exploration processes satisfies

\[ \lim_{N\to\infty}\ \sup_{v \in \{n_N+1,\dots,N\}} \mathbb{P}(\text{coupling with initial vertex } v \text{ ends in (E1) or (E2)}) = 0, \]

i.e. the coupling succeeds with high probability.

Proof. We analyse one exploration step in detail. Let $\mathfrak{a}$ and $\mathfrak{d}$ denote the active and dead
vertices of a feasible configuration at the beginning of an exploration step, that is, $\mathfrak{a}, \mathfrak{d}$ denote
two disjoint subsets of $\{n_N+1,\dots,N\}$ with $\#(\mathfrak{a} \cup \mathfrak{d}) < c_N$ and $\mathfrak{a} \neq \emptyset$.

The exploration of the minimal vertex $n$ in the set $\mathfrak{a}$ may only fail for one of the following
reasons:

(Ia) the vertex $n$ has left descendants in $\mathfrak{d}$,

(Ib) the vertex $n$ has left descendants which themselves have right descendants in $\mathfrak{d}$, or

(II) the vertex $n$ has right descendants in $\mathfrak{a} \cup \mathfrak{d}$.

Indeed, if (Ia) and (Ib) do not occur then the exploration to the left ends neither in state (E1)
nor (E2), and if (II) does not happen the exploration to the right does not fail.

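Conditionally on the configuration $(\mathfrak{a}, \mathfrak{d})$, the probability for the event (Ia) is

\[ \mathbb{P}(\exists a \in \mathfrak{d} \text{ such that } \Delta Z[a,n-1] = 1) \le \sum_{\substack{a \in \mathfrak{d} \\ a<n}} \mathbb{P}(\Delta Z[a,n-1] = 1), \]

whereas the probability for (Ib) is, by Lemma 2.10,

\[
\begin{aligned}
\mathbb{P}(\exists a \in \mathfrak{d}^c \text{ and } b \in \mathfrak{d} \text{ such that } & \Delta Z[a,n-1] = \Delta Z[a,b-1] = 1) \\
&\le \sum_{\substack{a \in \mathfrak{d}^c \\ a<n}} \sum_{\substack{b \in \mathfrak{d} \\ b>a}} \mathbb{P}(\Delta Z[a,n-1] = \Delta Z[a,b-1] = 1) \\
&\le \sum_{\substack{a \in \mathfrak{d}^c \\ a<n}} \sum_{\substack{b \in \mathfrak{d} \\ b>a}} \mathbb{P}(\Delta Z[a,n-1] = 1)\, \mathbb{P}^1(\Delta Z[a,b-1] = 1).
\end{aligned}
\]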



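If the vertex $n$ is of type $\tau \neq \ell$, then the probability of (II) is

\[
\begin{aligned}
\mathbb{P}(\exists a \in \mathfrak{a} \text{ such that } \Delta Z[n,a-1] = 1 \mid \Delta Z[n,\tau-1] = 1,\ & \Delta Z[n,b-1] = 0\ \forall b \in \mathfrak{d}\setminus\{\tau\}) \\
&\le C_{2.12} \sum_{\substack{a \in \mathfrak{a}\cup\mathfrak{d} \\ a>n}} \mathbb{P}^1(\Delta Z[n,a-1] = 1),
\end{aligned}
\]

using first Lemma 2.12 and then Lemma 2.10.

If the vertex $n$ is of type $\ell$, the probability of (II) is

\[ \mathbb{P}(\exists a \in \mathfrak{a}\cup\mathfrak{d} \text{ such that } \Delta Z[n,a-1] = 1) \le \sum_{\substack{a \in \mathfrak{a}\cup\mathfrak{d} \\ a>n}} \mathbb{P}(\Delta Z[n,a-1] = 1). \]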



Since, by Lemma 2.11, for any $a > n$,

\[ \mathbb{P}^1(\Delta Z[n,a-1] = 1) \le \mathbb{P}^1(\Delta Z[n_N+1,n_N+1] = 1), \]

we conclude that the probabilities of the events (Ia) and (II) are bounded by

\[ (2 + C_{2.12})\, c_N\, \mathbb{P}^1(\Delta Z[n_N+1,n_N+1] = 1), \]

independently of the type $\tau$. Moreover, the probability of (Ib) is bounded by

\[ c_N\, \mathbb{P}^1(\Delta Z[1,n_N] = 1) \sum_{a=1}^{n-1} \mathbb{P}(\Delta Z[a,n-1] = 1). \]



The sum is the expected outdegree of vertex $n$, which, by Lemma 2.7, is uniformly bounded
and, hence, one of the events (Ia), (Ib), or (II) occurs in one step with probability less than a
constant multiple of $c_N\, \mathbb{P}^1(\Delta Z[1,n_N] = 1)$. As there are at most $c_N$ exploration steps until we
end in one of the states (A), (B), or (C), the coupling fails due to (E1) or (E2) with a probability
bounded from above by a constant multiple of

\[ c_N^2\, \mathbb{P}^1(\Delta Z[1,n_N] = 1) \le C\, f(1)\, \frac{c_N^2}{n_N^{1-\gamma}} \longrightarrow 0, \]

in other words, the coupling succeeds with high probability. □



Proof of Proposition 5.1. Apply the coupling of Lemma 5.2 with $(n_N)_{N\in\mathbb{N}}$ satisfying

\[ \lim_{N\to\infty} \frac{\log n_N}{\log N} = 0 \]

and the hypothesis $\lim_{N\to\infty} c_N^2 / n_N^{1-\gamma} = 0$ of Lemma 5.2. Then, by Lemma 4.2, we get that with high probability

\[ \text{coupling ends in (B)} \implies \#C_N(V) \ge c_N. \tag{26} \]

As in the proof of Lemma 4.2 one gets $\lim_{N\to\infty} \max_{w=1,\dots,n_N} \mathbb{P}(\#T(w) < c_N) = 0$, so that
implication (26) is also valid for $\#T(V)$. Since the coupling succeeds we have, with high probability,

\[ \text{coupling ends in (A) or (B)} \implies \#C_N(V) \wedge \#T(V) \ge c_N, \]

and the statement follows immediately. □
6 The idealized exploration process

We have seen so far that the neighbourhood of a vertex $v$ in a large network is similar to the
random tree $T(v)$ constructed in Section 5.1. It is our aim in the present section to clarify the
relationship between $T(V)$, for an initial vertex $V$ chosen uniformly from $\{1,\dots,N\}$, and the
idealized neighbourhood tree (INT) $\mathfrak{T}$ featuring in our main theorems. Our main aim is to prove the
following result.

Proposition 6.1. Suppose that $(c_N)_{N\in\mathbb{N}}$ is a sequence of integers with

\[ \lim_{N\to\infty} \frac{c_N}{\log N\,\log\log N} = 0. \]

Then each pair $(V, \mathcal{G}_N)$ can be coupled with $\mathfrak{T}$ such that with high probability

\[ \#C_N(V) \wedge c_N = \#\mathfrak{T} \wedge c_N. \]

The basic idea is to introduce a projection

\[ \pi_N : (-\infty, 0] \to \{1,\dots,N\}, \]

which maps $t \le 0$ onto the smallest $m \in \{1,\dots,N\}$ with $t \le -t_N + t_m$. Applying $\pi_N$ to each
element of the INT $\mathfrak{T}$ we obtain a branching process with location parameters in $\{1,\dots,N\}$,
which we call the $\pi_N$-projected INT. We need to show, using a suitable coupling, that when the INT
is started with a vertex $-X$, where $X$ is standard exponentially distributed, then this projection
is close to the random tree $T(V)$. Again we apply the concept of an exploration process.
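The projection $\pi_N$ is easy to evaluate numerically. The sketch below assumes that $t_n = \sum_{k=1}^{n-1} 1/k$ (so that $t_1 = 0$ and $t_{n+1} - t_n = 1/n$), which is consistent with the logarithmic bounds on $t_n - t_m$ used later in this section but is not restated here; the function names are hypothetical.

```python
import bisect
from typing import List


def harmonic_times(N: int) -> List[float]:
    """Return [t_1, ..., t_N] under the assumption t_n = sum_{k=1}^{n-1} 1/k."""
    times = [0.0]
    for k in range(1, N):
        times.append(times[-1] + 1.0 / k)
    return times


def project(t: float, N: int) -> int:
    """Smallest m in {1, ..., N} with t <= -t_N + t_m, for t <= 0."""
    if t > 0:
        raise ValueError("the projection is only defined on (-infinity, 0]")
    times = harmonic_times(N)
    shifted = [tm - times[-1] for tm in times]  # increasing, last entry is 0.0
    # bisect_left returns the first index whose entry is >= t
    return bisect.bisect_left(shifted, t) + 1
```

Sampling the initial location of the projected INT then amounts to calling `project(-x, N)` for a standard exponential sample `x`.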

To this end we show that, for every $v \le 0$, the $\pi_N$-projected descendants of $v$ have a similar
distribution as the descendants of a vertex in location $\pi_N(v)$ in the labelled tree of Section 5.1.
We provide couplings of both distributions and control the probability that they fail.

Coupling the evolution to the right for $\ell$-type vertices

We fix $v \le 0$ and $N \in \mathbb{N}$, and suppose that $m := \pi_N(v) \ge 2$. For an $\ell$-type vertex in $v$
the cumulative sum of $\pi_N$-projected right descendants is distributed as $(Z_{t_n - t_N - v})_{m \le n \le N}$.
This distribution has to be compared with the distribution of $(Z[m,n])_{m \le n \le N}$, which is the
cumulative sum of right descendants of $m$ in $T(v)$.

Lemma 6.2. Fix a level $T \in \mathbb{N}$. For any $v \le 0$ with $\pi_N(v) = m \in \{2,3,\dots,N\}$ we can couple
the processes $(Z_{t_n - t_N - v} : n \ge m)$ and $(Z[m,n] : n \ge m)$ such that for the coupled processes
$(\mathcal{Z}^{(1)}[n] : n \ge m)$ and $(\mathcal{Z}^{(2)}[n] : n \ge m)$ we have

\[ \mathbb{P}(\mathcal{Z}^{(1)}[n] \neq \mathcal{Z}^{(2)}[n] \text{ for some } n \le \tau) \le (f(0) + f(T)^2)\, \frac{1}{m-1}, \]

where $\tau$ is the first time when one of the processes reaches or exceeds $T$.

Proof. We define the process $\mathcal{Z} = ((\mathcal{Z}^{(1)}[n], \mathcal{Z}^{(2)}[n]) : n \ge m)$ as Markov process with starting
distribution $\mathcal{L}(Z_{t_m - t_N - v}) \otimes \delta_0$ and transition kernels $p^{(n)}$ such that the first and second marginal
are the respective transition probabilities of $(Z_{t_n - t_N - v} : n \ge m)$ and $(Z[m,n] : n \ge m)$ and, for
any integer $a \ge 0$, the law $p^{(n)}((a,a), \cdot)$ is the coupling of the laws of $Z_{\Delta t_n}$ and $Z[n,n+1]$
under $\mathbb{P}^a$ provided in Lemma 2.13. Then the processes $(\mathcal{Z}^{(1)}[n] : n \ge m)$ and $(\mathcal{Z}^{(2)}[n] : n \ge m)$
are distributed as stated in the lemma. Moreover, letting $\sigma$ denote the first time when they
disagree, we get

\[ \mathbb{P}(\sigma \le \tau) = \sum_{n=m}^{\infty} \mathbb{P}(\tau \ge n, \sigma = n) \le \mathbb{P}(\sigma = m) + \sum_{n=m}^{\infty} \mathbb{P}(\sigma = n+1 \mid \tau > n, \sigma > n) \]

and, by Lemma 2.13,

\[ \mathbb{P}(\sigma = n+1 \mid \tau > n, \sigma > n) \le \Big( f(T)\, \frac{1}{n} \Big)^2 \quad \text{for } n \in \{m, m+1, \dots\}. \]

Moreover, $\mathbb{P}(\sigma = m) = \mathbb{P}(\mathcal{Z}^{(1)}[m] > 0) = 1 - e^{-(t_m - t_N - v) f(0)} \le f(0)\, \frac{1}{m-1}$. Consequently,

\[ \mathbb{P}(\sigma \le \tau) \le f(0)\, \frac{1}{m-1} + f(T)^2 \sum_{n=m}^{\infty} \frac{1}{n^2} \le (f(0) + f(T)^2)\, \frac{1}{m-1}. \]

□
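The last inequality in the proof rests on a telescoping comparison of the tail of $\sum 1/n^2$. Written out for $m \ge 2$, the step reads as follows (a standard estimate, added here only to make the bound explicit):

```latex
\[
\sum_{n=m}^{\infty} \frac{1}{n^{2}}
\;\le\; \sum_{n=m}^{\infty} \frac{1}{n(n-1)}
\;=\; \sum_{n=m}^{\infty} \Big( \frac{1}{n-1} - \frac{1}{n} \Big)
\;=\; \frac{1}{m-1}.
\]
```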

Coupling the evolution to the left 

Recall that a vertex $v \le 0$ produces a Poissonian number of $\pi_N$-projected descendants at the
location $m \le n := \pi_N(v)$ with parameter

\[ \lambda := \int_{-t_N + t_{m-1}}^{(-t_N + t_m) \wedge v} e^{-(v-u)}\, \mathbb{E}[f(Z_{v-u})]\, du. \tag{27} \]

Here we adopt the convention that $t_0 = -\infty$. A vertex in location $n$ in $T(v)$ produces a Bernoulli
distributed number of descendants in $m$ with success probability $\mathbb{P}(\Delta Z[m,n-1] = 1)$ for $m < n$
and success probability zero for $m = n$. The following lemma provides a coupling of both
distributions.
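How close a Poisson and a Bernoulli variable can be coupled is governed by their total variation distance, which equals the smallest possible disagreement probability of a coupling. The sketch below computes this distance exactly; the function name is hypothetical, and the comparison bound $p^2 + |\lambda - p|$ mentioned afterwards is the standard textbook estimate, which may differ in form from the one stated in Lemma A.1.

```python
import math


def tv_poisson_bernoulli(lam: float, p: float) -> float:
    """Total variation distance between Poiss(lam) and Bernoulli(p).

    Both laws live on the non-negative integers; Bernoulli(p) puts no mass
    on {2, 3, ...}, so the Poisson tail beyond 1 contributes in full.
    """
    p0 = math.exp(-lam)          # Poisson mass at 0
    p1 = lam * math.exp(-lam)    # Poisson mass at 1
    tail = 1.0 - p0 - p1         # Poisson mass on {2, 3, ...}
    return 0.5 * (abs(p0 - (1.0 - p)) + abs(p1 - p) + tail)
```

A maximal coupling of the two laws disagrees with exactly this probability, and by the triangle inequality through Poiss($p$) the value is at most $p^2 + |\lambda - p|$.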

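Lemma 6.3. There exists a constant $C_{6.3} > 0$ such that the following holds: Let $m, N \in \mathbb{N}$
and $v \le 0$ with $m \le n := \pi_N(v)$ and define $\lambda$ as in (27). If $m < n$, one can couple a Poiss($\lambda$)
distributed random variable with $\Delta Z[m,n-1]$, such that the coupled random variables $\mathcal{Z}^{(1)}$ and
$\mathcal{Z}^{(2)}$ satisfy

\[ \mathbb{P}(\mathcal{Z}^{(1)} \neq \mathcal{Z}^{(2)}) \le C_{6.3}\, \frac{1}{n^{1-\gamma}\, m^{1+\gamma}}. \]

Additionally, if $m = n \ge 2$, a Poiss($\lambda$) distributed random variable $\mathcal{Z}^{(1)}$ satisfies

\[ \mathbb{P}(\mathcal{Z}^{(1)} \neq 0) \le f(0)\, \frac{1}{n-1}. \]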



Proof. First consider the case where $m = n \ge 2$. Note that $u \mapsto e^{-u}\, \mathbb{E}[f(Z_u)]$ is decreasing so
that

\[ \lambda \le \int_{-t_N + t_{n-1}}^{v} e^{-(v-u)}\, \mathbb{E}[f(Z_{v-u})]\, du \le f(0)\, \frac{1}{n-1}, \]

which leads directly to the second statement of the lemma. Next, consider the case where
$2 \le m < n$. Note that for $u \in (-t_N + t_{m-1}, -t_N + t_m]$, one has $v - u \in (t_{n-1} - t_m, t_n - t_{m-1})$
which, using again that $u \mapsto e^{-u}\, \mathbb{E}[f(Z_u)]$ is decreasing, implies that

\[ \frac{1}{m-1}\, e^{-(t_n - t_{m-1})}\, \mathbb{E}[f(Z_{t_n - t_{m-1}})] \le \lambda \le \frac{1}{m-1}\, e^{-(t_{n-1} - t_m)}\, \mathbb{E}[f(Z_{t_{n-1} - t_m})]. \]

Next, note that $\log \frac{n}{m} \le t_n - t_m \le \log \frac{n-1}{m-1}$ so that



(! " Sib) ^1 E [/(^-i-* m )] < A < (1 + ^) jjij E[/(Z t „_ 1 _ tm )]. (28) 

On the other hand, $\Delta Z[m,n-1]$ is a Bernoulli random variable with success probability

\[ p := \frac{1}{n-1}\, \mathbb{E}[f(Z[m,n-1])]. \]



By Lemma A.l it suffices to control A 2 and |A —p\. By Proposition 2.14 and (28), 



and 



Since t. 



I A - P| < C -^ -^ (E^Z^tJ] + E[/(Z[ 



m, n 



1])]), 



n— 1 



t m ^ log ^iij2 1 we get with Lemma 



A 2 < 4(^ T )^E[/(Z tn _ 1 _ tm )] 2 . 

and Lemma 



(29) 



(30) 



2.1 



2.7 



that 



E[/(^-i-tJ] + nf(Z[m, n - 1])] < C (-V . 
Recalling that $n > m \ge 2$, it is now straightforward to deduce the statement from equations
(29) and (30). It remains to consider the case where $1 = m < n$. Here, we apply Lemma 2.1
and $t_{n-1} \ge \log(n-1)$ to deduce that



X 



/-tjv+ti /-oo /-i 

-oo Jtn—1 ' 



(n-1) 



7-1 



while, by Lemma 2.7, $\mathbb{P}(\Delta Z[1,n-1] = 1) \le f(0)\, (n-1)^{\gamma-1}$, so that a Poiss($\lambda$) distributed
random variable can be coupled with $\Delta Z[1,n-1]$ so that they disagree with probability less
than a constant multiple of $n^{\gamma-1}$. □



Remark 6.4. Lemma 6.3 provides a coupling for the mechanisms with which both trees produce
left descendants. Since the numbers of descendants in individual locations form an independent
sequence of random variables, we can apply the coupling of the lemma sequentially for each
location and obtain a coupling of the $\pi_N$-projected left descendants of a vertex $v$ and the left
descendants of $n := \pi_N(v)$ in $T(v)$. Indeed, under the assumptions of Lemma 6.3 one finds a
coupling of both processes such that, if $n \ge 2$,

\[ \mathbb{P}(\text{families of left descendants disagree}) \le C_{6.3}\, \frac{1}{n} + C_{6.3}\, \frac{1}{n^{1-\gamma}} \sum_{m=1}^{n-1} \frac{1}{m^{1+\gamma}} \le C_{6.4}\, \frac{1}{n^{1-\gamma}}, \]

where $C_{6.4}$ is a suitable positive constant.
m=l 
Coupling the evolution to the right for particles of type $\tau \neq \ell$

We fix $v \le 0$ and $N \in \mathbb{N}$, and suppose that $m := \pi_N(v) \ge 2$. Also fix a type $\tau < -v$ with
$l := \pi_N(v + \tau) > m$. The cumulative sum of $\pi_N$-projected right descendants of a vertex $v$
of type $\tau$ (including its predecessor) is distributed according to $(Z_{-t_N + t_n - v} : m \le n \le N)$
conditioned on $\Delta Z_\tau = 1$. The cumulative sum of right descendants in $T(v)$ of a vertex in $m$ of
type $l$ (including the predecessor) is distributed according to the law of $(Z[m,n] : m \le n \le N)$
conditioned on $\Delta Z[m,l-1] = 1$. Both processes are Markov processes and we provide a coupling
of their transition probabilities.

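Lemma 6.5. There exists a constant $C_{6.5} > 0$ such that the following holds: Let $k \ge 0$, $m, n \ge 1$
be integers with $k + 1 \le m \le n$, and let $\tau \in (t_n - t_m, t_{n+1} - t_m]$. Then the random variables
$Z_{\Delta t_m}$ under $\mathbb{P}^k(\,\cdot \mid \Delta Z_\tau = 1)$ and $Z[m,m+1]$ under $\mathbb{P}^k(\,\cdot \mid \Delta Z[m,n] = 1)$ can be coupled such
that the resulting random variables $\mathcal{Z}^{(1)}$ and $\mathcal{Z}^{(2)}$ satisfy

\[ \mathbb{P}(\mathcal{Z}^{(1)} \neq \mathcal{Z}^{(2)}) \le C_{6.5} \Big( \frac{f(k)}{m} \Big)^2. \]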

Proof. As i?[m, m + 1] G {&,£; + 1} there exists a coupling such that 

F(Z W ± Z {2) ) = |P(£ (1) = jfe) - P(Z (2) = k)\ + P(£ (1) > fc + 2). 
The second error term is of the required order since, by Lemma |2.5[ 



W > jfe + 2) < P fc+1 (Z 1/m ^ fc + 3) < ( /(fc + 2) ) 2 . 



l/r,. 



It remains to analyse the first error term. We have 

p(Z (2) = k) = 1 - /(ib)At 



(2) _ ,., _ , ,-,,., x, P m+l,n/(A: + 1) 



-^m,nj\k) 
and, representing (ZJ : t ^ 0) by its compensator, 



p(z ™^"^{- /w rw4 



We need to compare 



Pm+l,nf{k + 1) PJ(fc + 1) 

P «U alld p j/,n — for U € [tn - tm+l, Wl - tm\- 



By Lemma 2.1 and Proposition 2.14, one has, for a E {k, k + 1} and sufficiently large m, 
Puf{a) < P tn+1 - tm /(a) < e 7( - + " ) J P i „- W i/(«) < e^+^l + Cfcg^?) Pm+i,n/(a) 

Conversely, 

PJ(a) ^ P tn - Wl /(a) ^ e-^P t „_ tm /(a) > e ~^(l - Cfenf^) P m ,„/(a). 
We only need to consider large m and we may assume that Cfcrz y ^ o' as otherwise we 

may choose C fe^] large to ensure that the right hand side in the display of the lemma exceeds 
one. Then 

Puf(a) > e-^- 2C &^P m;n f{a), 

since e~ 2y ^ 1 — y for y G [0, 1/2]. Consequently, 

^- 7 (2i + l)-3C^TH^^ P m +l,nf{k + 1) P u /(fc + 1) ^ (2 ^ + l )+3qTTI1 /(W) P m +l,nf{k + 1) 

g ' v m n 7 ^ I m <^ <^ 6 m n J ^ I m . 



Recall that, by Lemma 



2.2 



m p 1 ' n }r k \ is uniformly bounded over all k so that we arrive at 



Pm + l,nf(k + 1) c f(k) P u f(k + 1) P m+ i, w /(fc + 1) + c /(fc) 

P m ,nf(k) m P u f{k) Pm,nf{k) m 

for an appropriate constant C > 0. Therefore, 

P(Z (1) = fc) - P(Z (2) = k) 

< 1 A exp{-/(fc)At m ( ^%+ 1 ) - C ^)} - (1 - /(fe)At m ^%^ ) 



^— ,^*r 



^(^) 2 + ,(/(^4^-t^-^^)) 2 ^^(^)^ 



Similarly, one finds that 

F(Z^ = k)- P(Z« = k) < Cfe3](4r) 2 , 
and putting everything together yields the assertion. □ 



From Lemma 6.5 we get the following analogue of Lemma 6.2.

Lemma 6.6. Fix a level $T \in \mathbb{N}$. For any $v \le 0$ and $\tau < -v$ with $\pi_N(v) = m \in \{2,3,\dots,N\}$
and $m < l := \pi_N(v + \tau)$ we can couple the processes $(Z_{t_n - t_N - v} : n \ge m)$ conditioned on
$\Delta Z_\tau = 1$ and $(Z[m,n] : n \ge m)$ conditioned on $\Delta Z[m,l-1] = 1$ such that the coupled processes
$(\mathcal{Z}^{(1)}[n] : n \ge m)$ and $(\mathcal{Z}^{(2)}[n] : n \ge m)$ satisfy

\[ \mathbb{P}(\mathcal{Z}^{(1)}[n] \neq \mathcal{Z}^{(2)}[n] \text{ for some } n \le \sigma) \le C_{6.6}\, (f(T)^2 + 1)\, \frac{1}{m-1}, \]

where $\sigma$ is the first time when one of the processes reaches or exceeds $T$.

Proof. We define the process $\mathcal{Z} = ((\mathcal{Z}^{(1)}[n], \mathcal{Z}^{(2)}[n]) : n \ge m)$ as Markov process with starting
distribution $\mathcal{L}(Z_{t_m - t_N - v} \mid \Delta Z_\tau = 1) \otimes \delta_0$ and transition kernels $p^{(n)}$ such that the first and second
marginal are the conditioned transition probabilities of $(Z_{t_n - t_N - v} : n \ge m)$ and $(Z[m,n] : n \ge m)$
as stated in the lemma. In the case where $n < l - 1$, we demand that, for any integer $a \ge 0$,
the law $p^{(n)}((a,a), \cdot)$ is the coupling of the laws of $Z_{\Delta t_n}$ under $\mathbb{P}^a(\,\cdot \mid \Delta Z_{\tau - (t_n - t_N - v)} = 1)$ and
$Z[n,n+1]$ under $\mathbb{P}^a(\,\cdot \mid \Delta Z[n,l-1] = 1)$ provided in Lemma 6.5. Conversely, we apply the
unconditioned coupling of Lemma 6.2 for $n \ge l$. Letting $\varrho$ denote the first time when both
evolutions disagree, we get

\[ \mathbb{P}(\varrho \le \sigma) = \sum_{n=m}^{\infty} \mathbb{P}(\sigma \ge n, \varrho = n) \le \mathbb{P}(\varrho = m) + \sum_{n=m}^{\infty} \mathbb{P}(\varrho = n+1 \mid \sigma > n, \varrho > n) \]

and, by Lemma 6.2 and Lemma 6.5,

\[ \mathbb{P}(\varrho = n+1 \mid \sigma > n, \varrho > n) \le C_{6.5} \Big( \frac{f(T)}{n} \Big)^2 \quad \text{for } n \in \{m, m+1, \dots\} \setminus \{l-1\}. \]



Moreover, P(g = m) < P 1 (Z tm _ t]V _ 1 , > 0) = 1 - e^*"*-^ 1 ) < ^ and P(g = J|g > I, a > I) < 
P T (^At J _ 1 > T) < /(T)i. Consequently, 



(Q^a)^ 



/(I) , /en 



m — 1 



m 



oo 



□ 



We are now in the position to complete the proof of Proposition 6.1. We couple the labelled
tree $T(V)$ and the $\pi_N$-projected INT, starting with a coupling of the position of the initial
vertex $V$ and $\pi_N(-X)$, which fails with probability going to zero, by Lemma A.2.



Again we apply the concept of an exploration process. As before we categorise vertices as veiled,
if they have not yet been discovered, active, if they have been discovered but their descendants
have not yet been explored, and dead, if they have been discovered and all their descendants have
been explored. In one exploration step the leftmost active vertex is picked and its descendants
are explored in increasing order with respect to the location parameter. We stop immediately
once one of the events (A), (B) or (C) happens. Note that in that case the exploration of the
last vertex might not be completed. Moreover, when coupling two explorations, we also stop in
the adverse event (E) that the explored graphs disagree. In event (B), the parameters $(n_N)_{N\in\mathbb{N}}$
are chosen such that

\[ \lim_{N\to\infty} \frac{(\log N \log\log N)^{\alpha}}{n_N} = 0 \quad \text{and} \quad \lim_{N\to\infty} \frac{\log n_N}{\log N} = 0, \]

for $\alpha := (1-\gamma)^{-1} \vee 3$. Noting that we never need to explore more than $c_N$ vertices, we see



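from Lemma 6.2, Remark 6.4 and Lemma 6.6 that the probability of a failure of this coupling
is bounded by a constant multiple of

\[ c_N\, (1 + f(c_N)^2)\, \frac{1}{n_N} + c_N\, \frac{1}{n_N^{1-\gamma}} \longrightarrow 0, \]

where the convergence uses that $f$ grows at most linearly together with the choice of $\alpha$.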



Consequently, the coupling succeeds with high probability. As in Lemma 4.2 it is easy to see
that, with high probability, event (B) implies that

\[ \#T(V) \ge c_N \quad \text{and} \quad \#\mathfrak{T} \ge c_N. \]

Hence we have

\[ \#T(V) \wedge c_N = \#\mathfrak{T} \wedge c_N \quad \text{with high probability}, \]

and the statement follows by combining this with Proposition 5.1.



7 The variance of the number of vertices in large clusters 

In this section we provide the second moment estimate needed to show that our key empirical
quantity, the number of vertices in connected components of a given size, concentrates
asymptotically near its mean.
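The empirical quantity in question can be explored by simulation. The following is a rough sketch under stated assumptions: it takes the attachment mechanism to be that vertex $n+1$ connects to each $m \le n$ independently with probability $f(\text{indegree of } m)/n$, capped at 1, which matches the success probabilities used in Section 6 but is not restated in this section. All function names are hypothetical.

```python
import random
from collections import Counter
from typing import Callable, List, Tuple


def simulate_network(
    N: int, f: Callable[[int], float], rng: random.Random
) -> List[Tuple[int, int]]:
    """Sample the edges (old vertex, new vertex) of a network on {1, ..., N}."""
    indegree = [0] * (N + 1)
    edges: List[Tuple[int, int]] = []
    for new in range(2, N + 1):
        n = new - 1                               # number of existing vertices
        for m in range(1, new):
            if rng.random() < min(1.0, f(indegree[m]) / n):
                edges.append((m, new))
                indegree[m] += 1                  # m is visited once per step
    return edges


def component_sizes(N: int, edges: List[Tuple[int, int]]) -> List[int]:
    """Size of the component of each vertex 1, ..., N, computed by union-find."""
    parent = list(range(N + 1))

    def find(x: int) -> int:
        while parent[x] != x:
            parent[x] = parent[parent[x]]         # path halving
            x = parent[x]
        return x

    for a, b in edges:
        ra, rb = find(a), find(b)
        if ra != rb:
            parent[ra] = rb
    counts = Counter(find(v) for v in range(1, N + 1))
    return [counts[find(v)] for v in range(1, N + 1)]


def proportion_in_large_clusters(sizes: List[int], c: int) -> float:
    """Empirical proportion of vertices lying in components of size at least c."""
    return sum(s >= c for s in sizes) / len(sizes)
```

Repeating the simulation for a fixed attachment rule and comparing the resulting proportions across runs gives a numerical impression of the concentration that the variance estimate below makes rigorous.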
Proposition 7.1. Suppose that $(c_N)_{N\in\mathbb{N}}$ and $(n_N)_{N\in\mathbb{N}}$ are sequences of integers satisfying
$1 \le c_N, n_N \le N$ such that $c_N^2\, n_N^{\gamma-1}$ is bounded from above. Then, for a constant $C_{7.1} > 0$
depending on these sequences and on $f$, we have

\[ \operatorname{var}\Big( \frac{1}{N} \sum_{v=1}^{N} 1\{\#C_N(v) \ge c_N\} \Big) \le 2\, \mathbb{P}(\#C_N(V) < c_N \text{ and } C_N(V) \cap \{1,\dots,n_N\} \neq \emptyset) + \frac{c_N}{N} + C_{7.1}\, \frac{c_N^2}{n_N^{1-\gamma}}, \]

where $V$ is independent of $\mathcal{G}_N$ and uniformly distributed on $\{1,\dots,N\}$.

Proof. Let $v, w$ be two distinct vertices of $\mathcal{G}_N$. We start by exploring the neighbourhood of $v$
similarly as in Section 5. As before we classify the vertices as veiled, active and dead, and in the 
beginning only v is active and the remaining vertices are veiled. In one exploration step we pick 
the leftmost active vertex and consecutively (from the left to the right) explore its immediate 
neighbours in the set of veiled vertices only. Newly found vertices are activated and the vertex 
to be explored is set to dead after the exploration. We immediately stop the exploration once 
one of the events 

(A) the number of unveiled vertices in the cluster reaches $c_N$,

(B) one vertex in $\{1,\dots,n_N\}$ is activated, or

(C) there are no more active vertices left, 

happens. Note that when we stop due to (A) or (B) the exploration of the last vertex might not 
be finished. In that case we call this vertex semi-active. 

We proceed with a second exploration process, namely the exploration of the cluster of w. This 
exploration follows the same rules as the first exploration process, treating vertices that remained 
active or semi-active at the end of the first exploration as veiled. In addition to the stopping 
in the cases (A), (B), (C) we also stop the exploration once a vertex is unveiled which was also 
unveiled in the first exploration, calling this event (D). We consider the following events: 

$E_v$ : the first exploration started with vertex $v$ ends in (A) or (B);

$E_1^{v,w}$ : $w$ is unveiled during the first exploration (that of $v$);

$E_2^{v,w}$ : $w$ remains veiled in the first exploration and the second exploration ends in (A) or (B)
but not in (D);

$E_3^{v,w}$ : $w$ remains veiled in the first exploration and the second exploration ends in (D).

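We have

\[ \sum_{v=1}^{N} \sum_{w=1}^{N} \mathbb{P}(\#C_N(v) \ge c_N, \#C_N(w) \ge c_N) \le \sum_{v=1}^{N} \sum_{w=1}^{N} \sum_{k=1}^{3} \mathbb{P}(E_v \cap E_k^{v,w}) = \sum_{v=1}^{N} \mathbb{P}(E_v) \sum_{k=1}^{3} \sum_{w=1}^{N} \mathbb{P}(E_k^{v,w} \mid E_v). \tag{31} \]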
As the first exploration immediately stops once one has unveiled $c_N$ vertices, we conclude that

\[ \sum_{w=1}^{N} \mathbb{P}(E_1^{v,w} \mid E_v) = \mathbb{E}\Big[ \sum_{w=1}^{N} 1_{E_1^{v,w}} \,\Big|\, E_v \Big] \le c_N. \tag{32} \]



To analyse the remaining terms, we fix distinct vertices $v$ and $w$ and note that the configuration
after the first exploration can be formally described by an element $\mathfrak{k}$ of

\[ \{\text{open}, \text{closed}, \text{unexplored}\}^{E_N}, \]

where $E_N := \{(a,b) \in \{1,\dots,N\}^2 : a < b\}$ denotes the set of possible edges. We pick a feasible
configuration $\mathfrak{k}$ and denote by $\mathcal{E}_{\mathfrak{k}}$ the event that the first exploration ended in this configuration.
On the event $\mathcal{E}_{\mathfrak{k}}$ the status of each vertex (veiled, active, semi-active or dead) at the end of the
first exploration is determined. Suppose $\mathfrak{k}$ is such that $w$ remained veiled in the first exploration,
which means that $\mathcal{E}_{\mathfrak{k}}$ and $E_1^{v,w}$ are disjoint events. Next, we note that

\[ \mathbb{P}(E_2^{v,w} \mid \mathcal{E}_{\mathfrak{k}}) \le \mathbb{P}(E_w). \tag{33} \]

Indeed, if in the exploration of $w$ we encounter an edge which is open in the configuration $\mathfrak{k}$, we
have unveiled a vertex which was also unveiled in the exploration of $v$, the second exploration
ends in (D) and hence $E_2^{v,w}$ does not happen. Otherwise, the event $\mathcal{E}_{\mathfrak{k}}$ influences the exploration
of $w$ only in the sense that in the degree evolution of some vertices some edges may be conditioned
to be closed. By Lemma 2.9 this conditional probability is bounded by the unconditional
probability and hence we obtain (33).

Finally, we analyse the probability $\mathbb{P}(E_3^{v,w} \mid \mathcal{E}_{\mathfrak{k}})$. If the second exploration process ends in state (D)
we have discovered an edge connecting the exploration started in $w$ to an active or semi-active
vertex $a$ from the first exploration. Recall that in each exploration we explore the immediate
neighbourhoods of at most $c_N$ vertices. Let $\mathfrak{K} \in \{\text{open}, \text{closed}, \text{unexplored}\}^{E_N}$ be a feasible configuration at the beginning
of the neighbourhood exploration of a vertex $n > n_N$ and note that this implies every edge which
is open (resp. closed) in $\mathfrak{k}$ is also open (resp. closed) in $\mathfrak{K}$. Recall that $\mathcal{E}_{\mathfrak{K}}$ denotes the event that
this configuration is seen in the combined exploration processes. We denote by $\mathfrak{a}$ and $\mathfrak{s}$ the set
of active and semi-active vertices of the first exploration induced by $\mathfrak{k}$ (or, equivalently, by $\mathfrak{K}$).
Moreover, we denote by $\mathfrak{d}$ the set of dead vertices of the combined exploration excluding the
father of $n$, and, for $a \in \mathfrak{a} \cup \mathfrak{s}$, we let $\mathfrak{d}_a$ denote the set of dead vertices of the ongoing exploration
excluding the father of $n$, plus the vertices that were marked as dead in the first exploration at
the time the vertex $a$ was discovered. We need to distinguish several cases.
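First, consider the case $a \in \mathfrak{a}$ with $a < n$. By definition of the combined exploration process,
we know that $a$ has no jumps in its indegree evolution at times associated to the vertices $\mathfrak{d}_a$. If
$a$ was explored from the right, say with father in $b$, we thus get

\[ \mathbb{P}(\exists \text{ edge between } a \text{ and } n \mid \mathcal{E}_{\mathfrak{K}}) = \mathbb{P}(\Delta Z[a,n-1] = 1 \mid \Delta Z[a,b-1] = 1 \text{ and } \Delta Z[a,d-1] = 0\ \forall d \in \mathfrak{d}_a). \tag{34} \]

If $a$ was explored from the left, then

\[ \mathbb{P}(\exists \text{ edge between } a \text{ and } n \mid \mathcal{E}_{\mathfrak{K}}) = \mathbb{P}(\Delta Z[a,n-1] = 1 \mid \Delta Z[a,d-1] = 0\ \forall d \in \mathfrak{d}_a). \tag{35} \]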

Second, consider the case $a \in \mathfrak{a}$ with $n < a$. By definition of the combined exploration process,
the indegree evolution of $n$ has no jumps that can be associated to edges connecting to $\mathfrak{d}$. Hence,
if $n$ was explored from the right, say with father in $b$, then

\[ \mathbb{P}(\exists \text{ edge between } a \text{ and } n \mid \mathcal{E}_{\mathfrak{K}}) = \mathbb{P}(\Delta Z[n,a-1] = 1 \mid \Delta Z[n,b-1] = 1 \text{ and } \Delta Z[n,d-1] = 0\ \forall d \in \mathfrak{d}), \tag{36} \]

and, if $n$ was explored from the left, then

\[ \mathbb{P}(\exists \text{ edge between } a \text{ and } n \mid \mathcal{E}_{\mathfrak{K}}) = \mathbb{P}(\Delta Z[n,a-1] = 1 \mid \Delta Z[n,d-1] = 0\ \forall d \in \mathfrak{d}). \tag{37} \]

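Third, consider $a \in \mathfrak{s}$ and denote by $a'$ the last vertex which was unveiled in the first exploration.
If $a' > n$ then the existence of an edge between $a$ and $n$ was already explored in the first
exploration and no edge was found. If $a' < n < a$, we find estimates (36), (37) again. If $a < n$
and the father $b$ of $a$ satisfies $b > a' \vee a$,

\[ \mathbb{P}(\exists \text{ edge between } a \text{ and } n \mid \mathcal{E}_{\mathfrak{K}}) \le \sup_{0 \le k \le c_N - 1} \mathbb{P}^k(\Delta Z[a \vee a', n-1] = 1 \mid \Delta Z[a \vee a', b-1] = 1 \text{ and } \Delta Z[a \vee a', d-1] = 0\ \forall d \in \mathfrak{d}_a), \tag{38} \]

and if $a = v$ or the father $b$ of $a$ satisfies $b < a \vee a'$,

\[ \mathbb{P}(\exists \text{ edge between } a \text{ and } n \mid \mathcal{E}_{\mathfrak{K}}) \le \sup_{0 \le k \le c_N} \mathbb{P}^k(\Delta Z[a \vee a', n-1] = 1 \mid \Delta Z[a \vee a', d-1] = 0\ \forall d \in \mathfrak{d}_a). \tag{39} \]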



Using first Lemma 2.12, then Lemma 2.10 and Lemma 2.11 we see that the terms (34)-(37) are
bounded by

C^ M ¥\AZ[a,n- 1] = l) < Cfem^f^ 
and similarly, the terms (38)-(39) are bounded by

CfcniP Cjv (AZ[o,n-l] = l)<Cfen| 

Note that there are at most $c_N$ vertices $a \in \mathfrak{a} \cup \mathfrak{s}$ and at most one of those is semi-active. For
each of these $a$ we have to test the existence of edges no more than $c_N$ times. Hence, using also



n N 



Pl,n N f(c N ) 



Lemma 2.7 and the boundedness of f(n)/n, we find Qn\> such that 



(ii 3 \h ) ^ L^J2\C N 



n N 



+ Q2I2ICN 



Pl,n N f{c N ) 



< 



n N 



Cfen-r 



TV 



n 



TV 



Summarising our steps, we have 

1 N 
var(-^l{#CivW^Civ} 



U=l 



AT TV 



TV TV 



sC E 



N 2 



E ^ 1{#C* (v) > CJV , #CatH ^ CAT} - T72 E E P(^)P(^' 



TV 



« = 1 W) = l 



+ 2 — E p (#Ctv W < c N and C/v(v) n {1, ... ,n N } ^ 0) 



■u=l 



-TV 



< 2F(#C N (V) < c N and C N (V) n {1, . . . , n^} / 0) + ^ + Cfcl]^ 



''A? 



as required to complete the proof. 



□ 
8 Proof of Theorem 1.7



We start by proving the lower bound for $C_N^{(1)}$. Suppose therefore that $p(f) > 0$, fix $\delta > 0$
arbitrarily small and use Lemma 3.4 to choose $\varepsilon > 0$ such that the survival probability of
the INT with attachment rule $f - \varepsilon$ is larger than $p(f) - \delta$. We denote by $(\mathcal{G}_N)_{N\in\mathbb{N}}$ a sequence of
random networks with attachment rule $f - \varepsilon$ and let $C_N(v)$ denote the connected component of $v$ in $\mathcal{G}_N$.
Suppose a vertex $V$ is chosen uniformly at random from $\{1,\dots,N\}$. We choose
$c_N := \lfloor \log N \sqrt{\log\log N} \rfloor$ and observe that by Proposition 6.1

\[ \mathbb{E}\Big[ \frac{1}{N} \sum_{v=1}^{N} 1\{\#C_N(v) \ge c_N\} \Big] = \mathbb{P}\{\#C_N(V) \ge c_N\} \longrightarrow \mathbb{P}\{\#\mathfrak{T} = \infty\} \ge p(f) - \delta, \tag{40} \]

as N tends to infinity. By Proposition 7.1 with un '■= I (log iV) 1 -^ J , we have 

1 ^ 
var(-^l{#Cjv(«)^c JV }) 



^=1 



CAT „_ C^ 



< 2¥(#C N (V) < c N and C N (V) n {1, . . . , n w } / 0) + -£ + <^t] — 



V ^n^ 



The first summand goes to zero by Lemma 4.2 and so do the remaining terms by the choice of 
our parameters. Hence 

$$\liminf_{N \to \infty} \frac{1}{N} \sum_{v=1}^{N} 1\{\#C_N(v) \ge c_N\} \ge p(f) - \delta \quad \text{in probability,}$$

and Proposition 4.1 implies that, with high probability, there exists a connected component
comprising at least a proportion $p(f)$ of all vertices, proving the lower bound.



To see the upper bound we work with the original attachment function $f$. In analogy to (40)
we obtain

$$\lim_{N \to \infty} \mathbb{E}\Big[\frac{1}{N} \sum_{v=1}^{N} 1\{\#C_N(v) \ge c_N\}\Big] = p(f).$$

As in the lower bound, the variance goes to zero, and hence we have

$$\lim_{N \to \infty} \frac{1}{N} \sum_{v=1}^{N} 1\{\#C_N(v) \ge c_N\} = p(f) \quad \text{in probability.}$$

From this we infer that

$$\limsup_{N \to \infty} \frac{\#\mathcal{C}^{(1)}_N}{N} \le \limsup_{N \to \infty} \Big(\frac{c_N}{N} \vee \frac{1}{N} \sum_{v=1}^{N} 1\{\#C_N(v) \ge c_N\}\Big) \le p(f) \quad \text{in probability,}$$

proving the upper bound.

Finally, to prove the result on the size of the second largest connected component, note that we
have seen in particular that

$$\lim_{N \to \infty} \frac{1}{N} \sum_{v=1}^{N} 1\{\#C_N(v) \ge c_N\} = p(f) \quad \text{in probability,}$$

so that, with high probability, the proportion of vertices in clusters of size $\ge c_N$ is asymptotically
equal to the proportion of vertices in the giant component. This implies that the proportion of
vertices which are not in the giant component but in components of size at least $c_N$ goes to
zero in probability, which is a stronger result than the stated claim.



9 Proof of Theorem 1.8 



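We fix $k \in \mathbb{N}$ and choose $c_N := k + 1$. By Proposition 6.1 we have

$$\lim_{N \to \infty} \mathbb{E}\Big[\frac{1}{N} \sum_{v=1}^{N} 1\{\#C_N(v) \le k\}\Big] = \lim_{N \to \infty} \mathbb{P}\{\#C_N(V) \le k\} = \mathbb{P}\{\#\mathfrak{T} \le k\}.$$

Since $c_N = k + 1$, the indicators $1\{\#C_N(v) \le k\}$ and $1\{\#C_N(v) \ge c_N\}$ sum to one,
and Proposition 7.1 yields

$$\mathrm{var}\Big(\frac{1}{N} \sum_{v=1}^{N} 1\{\#C_N(v) \le k\}\Big) = \mathrm{var}\Big(\frac{1}{N} \sum_{v=1}^{N} 1\{\#C_N(v) \ge c_N\}\Big) \longrightarrow 0.$$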



This implies the statement, as k is arbitrary. 



10 Proof of Theorem 1.5 



The equivalence of the divergence of the sequence in Theorem 1.5 and the criterion X = stated 



in (i) of Remark 1.6 follows from the bounds on the spectral radius of the operators $A_\alpha$ given in
the proof of Proposition 1.9. Moreover, it is easy to see from the arguments of Section 3 that the
survival of the INT under percolation with retention parameter $p$ is equivalent to the existence
of $0 < \alpha < 1$ such that

$$\rho(pA_\alpha) = p\,\rho(A_\alpha) < 1.$$



Hence, to complete the proof of Theorem 1.5 and Remark 1.6 it suffices to show that, for a fixed
retention parameter $0 < p < 1$, the existence of a giant component for the percolated network
is equivalent to the survival of the INT under percolation with retention parameter $p$. We now
give a sketch of this by showing how the corresponding arguments in the proof of Theorem 1.7
have to be modified.



As in the proof of Theorem 1.7 the main part of the argument consists of couplings of the 
exploration process of the neighbourhood of a vertex in the network to increasingly simple 
objects. To begin with we have to couple the exploration of vertices in the percolated network 
and the percolated labelled tree, using arguments as in Section 5. We only modify the exploration
processes a little: Whenever we find a new vertex, instead of automatically declaring it active, 
we declare it active with probability p and passive otherwise. We do this independently for each 
newly found vertex. We still explore at every step the leftmost active vertex, but we change 
the stopping criterion (E1): we now stop the process when we rediscover an active or passive
vertex. We also stop the process when we have discovered more than ^—^ cn passive vertices, 
calling this event (E3). All other stopping criteria are retained literally. 

By a simple application of the strong law of large numbers we see that the probability of stopping
in the event (E3) converges to zero. The proof of Lemma 5.2 carries over to our case, as it only
uses that the number of dead, active and passive vertices is bounded by a constant multiple of $c_N$.
Hence the coupling of explorations is successful with high probability. Similarly, the coupling
of the exploration processes for the random labelled tree and the idealised neighbourhood tree
constructed in Section 6 can be performed so that under the assumption on the parameters
given in Proposition 6.1 we have



$$\#C^*_N(V) \wedge c_N = \#\mathfrak{T}^* \wedge c_N \quad \text{with high probability,}$$

where $C^*_N(v)$ denotes the connected component in the percolated network which contains the
vertex $v$, and $\mathfrak{T}^*$ is the percolated INT.

In order to analyse the variance of the number of vertices in large clusters of the percolated
network we modify the exploration processes described in the proof of Proposition 7.1 a little:
In the first exploration we activate newly unveiled vertices with probability p and declare them 
passive otherwise. We always explore the neighbourhood of the leftmost active vertex and 
investigate its links to the set of veiled or passive vertices from left to right, possibly activating 
a passive vertex when it is revisited. We stop the exploration in the events (A), (B), and (C) as 
before, and additionally if the number of passive vertices exceeds 2—^2 cat, calling this event (A'). 
As before, the probability of stopping in (A') goes to zero by the strong law of large numbers.

The exploration of the second cluster follows the same rules as that of the first, treating vertices
that were left active, semi-active or passive in the first exploration as veiled. In addition to the
stopping events (A), (A'), (B) and (C) we also stop in the event (D) when a vertex is unveiled
which was also unveiled in the first exploration. This vertex may have been active, semi-active
or passive at the end of the first exploration. We then introduce the event $E_v$ that the first
exploration ends in events (A), (A') or (B), events $E_1^{v,w}$ and $E_3^{v,w}$ as before, and event $E_2^{v,w}$ that
$w$ remained veiled in the first exploration and the second exploration ends in (A), (A') or (B).
We can write

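$$\sum_{v=1}^{N} \sum_{w=1}^{N} \mathbb{P}\big(\#C^*_N(v) \ge c_N, \#C^*_N(w) \ge c_N\big) \le \sum_{v=1}^{N} \mathbb{P}(E_v) \sum_{k=1}^{3} \sum_{w=1}^{N} \mathbb{P}\big(E_k^{v,w} \mid E_v\big),$$

where $C^*_N(v)$ denotes the connected component of $v$ in the percolated network. The summand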
corresponding to k = 1 can be estimated as before. For the other summands we describe the 
configuration after the first exploration as an element 6 of 

{open, closed, removed, unexplored} N , 

where edges corresponding to the creation of passive vertices are considered as 'removed'. We 
again obtain that P^g'H £t) ^ ^(E w ) using the fact that if in the second exploration we ever 
encounter an edge which is open or removed in the configuration t the second exploration ends in 
(D) and E 2 ' w does not occur. Finally, the estimate of ¥(E^' W \ £j) carries over to our situation as 
it relies only on the fact that the number of unveiled vertices in the first exploration is bounded 
by a constant multiple of $c_N$. We thus obtain a result analogous to Proposition 7.1.

Using straightforward analogues of the results in Section 4 we can now show that the existence
of a giant component for the percolated network is equivalent to the survival of the INT under
percolation with retention parameter $p$ using the argument of Section 8. This completes the
proof of Theorem 1.5.
A Appendix

In this appendix we provide two auxiliary coupling lemmas. 

Lemma A.1. Let $\lambda \ge 0$ and $p \in [0,1]$, $X^{(1)}$ Poisson distributed with parameter $\lambda$, and $X^{(2)}$
Bernoulli distributed with parameter $p$. Then there exists a coupling of these two random
variables such that

$$\mathbb{P}(X^{(1)} \ne X^{(2)}) \le \lambda^2 + |\lambda - p|.$$

Proof. We only need to consider the case where $\lambda \in [0,1]$, as otherwise the right-hand side
exceeds one. Then $X^{(1)}$ can be coupled to a Bernoulli distributed random variable $X$ with
parameter $\lambda$, such that $\mathbb{P}(X^{(1)} \ne X) = \lambda - \lambda e^{-\lambda} \le \lambda^2$. Moreover, $X$ and $X^{(2)}$ can be
coupled such that $\mathbb{P}(X \ne X^{(2)}) = |p - \lambda|$. The two facts together imply the statement. □
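
The bound of Lemma A.1 can be checked numerically. The following minimal Python sketch assumes that the coupling is realised as a maximal coupling, whose mismatch probability equals the total variation distance of the two laws; this is the smallest mismatch probability any coupling can achieve, so it should satisfy the bound of the lemma a fortiori. The function names `mismatch_probability` and `lemma_a1_bound` are illustrative and do not appear in the paper.

```python
import math


def mismatch_probability(lam, p):
    """P(X1 != X2) under a maximal coupling of Poisson(lam) and Bernoulli(p).

    A maximal coupling attains the total variation distance, which is one
    minus the overlap of the two probability mass functions. The Bernoulli
    law is supported on {0, 1}, so only these two atoms contribute.
    """
    overlap = min(math.exp(-lam), 1.0 - p) + min(lam * math.exp(-lam), p)
    return 1.0 - overlap


def lemma_a1_bound(lam, p):
    """Right-hand side of Lemma A.1: lam^2 + |lam - p|."""
    return lam ** 2 + abs(lam - p)


if __name__ == "__main__":
    # Print the mismatch probability next to the bound for a few parameters.
    for lam in (0.05, 0.3, 0.8):
        for p in (0.0, 0.25, 0.9):
            print(lam, p, mismatch_probability(lam, p), lemma_a1_bound(lam, p))
```

For $p = \lambda \in [0,1]$ the sketch reproduces the quantity $\lambda - \lambda e^{-\lambda}$ used in the proof, since then the Bernoulli law puts less mass on $0$ and more mass on $1$ than the Poisson law.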

Lemma A.2. Let $Y$ be standard exponentially distributed and $X$ uniformly distributed on
{1, . . . , N}. Then X and Y can be coupled in such a way that 

$$\mathbb{P}(X \ne \pi_N(-Y)) \le C_{A.2}\,\frac{1}{N},$$

for the function $\pi_N$ defined at the beginning of Section 6.
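Proof. For $2 \le k \le N$ we have

$$\mathbb{P}(\pi_N(-Y) = k) = \mathbb{P}\Big(\sum_{j=k}^{N-1} \frac{1}{j} \le Y < \sum_{j=k-1}^{N-1} \frac{1}{j}\Big) = \exp\Big\{-\sum_{j=k}^{N-1} \frac{1}{j}\Big\} - \exp\Big\{-\sum_{j=k-1}^{N-1} \frac{1}{j}\Big\},$$

which is bounded from above by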


















expj - 


JV ~ 1 1 

j=k-i j 


"1)< 


s( 


i + N-i) 


(1 + ^T 


+ (fc-1) 2 ) 


) 





and similarly from below. This gives \P(ttn(—Y) = k) — jj\ ^ Wr. Hence we can couple the 
random variables so that, for a suitable constant $C_{A.2} > 0$,

$$\mathbb{P}(X \ne \pi_N(-Y)) \le \mathbb{P}(\pi_N(-Y) = 1) + \sum_{k=2}^{N} \Big|\mathbb{P}(\pi_N(-Y) = k) - \frac{1}{N}\Big| \le C_{A.2}\,\frac{1}{N}.$$ □
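
The closeness of the law of $\pi_N(-Y)$ to the uniform distribution can be explored numerically. The sketch below rests on two assumptions that should be checked against the definition of $\pi_N$ in Section 6: that for $2 \le k \le N$ the probability of $\{\pi_N(-Y) = k\}$ is $\exp\{-\sum_{j=k}^{N-1} 1/j\} - \exp\{-\sum_{j=k-1}^{N-1} 1/j\}$, as in the proof above, and that the remaining mass $\exp\{-\sum_{j=1}^{N-1} 1/j\}$ sits on $k = 1$. The names `law_of_pi` and `tv_to_uniform` are hypothetical helpers, not notation from the paper.

```python
import math


def law_of_pi(N):
    """Return [P(pi_N(-Y) = k) for k = 1, ..., N] under the assumed formula."""
    # tail[k] = sum_{j=k}^{N-1} 1/j, with tail[N] = 0 (empty sum).
    tail = [0.0] * (N + 1)
    for k in range(N - 1, 0, -1):
        tail[k] = tail[k + 1] + 1.0 / k
    # Assumption: all mass of {Y >= tail[1]} is assigned to the vertex k = 1.
    probs = [math.exp(-tail[1])]
    for k in range(2, N + 1):
        probs.append(math.exp(-tail[k]) - math.exp(-tail[k - 1]))
    return probs


def tv_to_uniform(N):
    """Total variation distance between the law above and the uniform law."""
    return 0.5 * sum(abs(p - 1.0 / N) for p in law_of_pi(N))


if __name__ == "__main__":
    # A maximal coupling of X and pi_N(-Y) has mismatch probability equal to
    # this distance, so N * distance staying bounded matches a bound of order 1/N.
    for N in (10, 100, 1000):
        print(N, N * tv_to_uniform(N))
```

The probabilities telescope to one, so the assumed formula does define a distribution on $\{1, \dots, N\}$; the printed values indicate how the distance to the uniform law scales with $N$ under these assumptions.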